linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/9] restart_block: Prepare the ground for dumping timeout
@ 2019-09-09 10:23 Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 1/9] futex: Remove unused uaddr2 in restart_block Dmitry Safonov
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

Hi,

I'm trying to address an issue in CRIU (Checkpoint Restore In Userspace)
with restarting timed syscalls. It's not possible to use
restart_syscall() as the majority of applications do, because after
restore the kernel doesn't know anything about a syscall that may have
been interrupted at checkpoint time. That's because the tasks are
re-created from scratch, so task_struct::restart_block isn't set on
a new task.

As a preparation, unify timeouts for different syscalls in
restart_block.

On the other hand, I'm still struggling with the patches that introduce
the new ptrace() request API. I'll talk about the difficulties of
designing a new ptrace operation at the Containers Microconference at
Plumbers [in the hope of finding a sensible solution].

Cc: Adrian Reber <adrian@lisas.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: containers@lists.linux-foundation.org
Cc: linux-fsdevel@vger.kernel.org

Dmitry Safonov (9):
  futex: Remove unused uaddr2 in restart_block
  restart_block: Prevent userspace set part of the block
  select: Convert __esimate_accuracy() to ktime_t
  select: Micro-optimise __estimate_accuracy()
  select: Convert select_estimate_accuracy() to take ktime_t
  select: Extract common code into do_sys_ppoll()
  select: Use ktime_t in do_sys_poll() and do_poll()
  select/restart_block: Convert poll's timeout to u64
  restart_block: Make common timeout

 fs/eventpoll.c                 |   4 +-
 fs/select.c                    | 214 ++++++++++++---------------------
 include/linux/poll.h           |   2 +-
 include/linux/restart_block.h  |  11 +-
 kernel/futex.c                 |  14 +--
 kernel/time/alarmtimer.c       |   6 +-
 kernel/time/hrtimer.c          |  14 ++-
 kernel/time/posix-cpu-timers.c |  10 +-
 kernel/time/posix-stubs.c      |   8 +-
 kernel/time/posix-timers.c     |   8 +-
 10 files changed, 115 insertions(+), 176 deletions(-)

-- 
2.23.0



* [PATCH 1/9] futex: Remove unused uaddr2 in restart_block
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 2/9] restart_block: Prevent userspace set part of the block Dmitry Safonov
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

uaddr2 has not been used since its introduction in commit 52400ba94675
("futex: add requeue_pi functionality").
The resulting union stays the same size, so nothing is saved in
task_struct, but there is one __user pointer less to keep.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 include/linux/restart_block.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h
index bba2920e9c05..e5078cae5567 100644
--- a/include/linux/restart_block.h
+++ b/include/linux/restart_block.h
@@ -32,7 +32,6 @@ struct restart_block {
 			u32 flags;
 			u32 bitset;
 			u64 time;
-			u32 __user *uaddr2;
 		} futex;
 		/* For nanosleep */
 		struct {
-- 
2.23.0



* [PATCH 2/9] restart_block: Prevent userspace set part of the block
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 1/9] futex: Remove unused uaddr2 in restart_block Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 3/9] select: Convert __esimate_accuracy() to ktime_t Dmitry Safonov
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

Parameters for nanosleep() can be chosen in a way that makes
hrtimer_nanosleep() fail. In that case the changes already made to
restart_block leave it in an inconsistent state. Luckily, that won't
corrupt anything critical for poll() or futex(). But it's not evident
that userspace may play tricks with the union by changing restart_block
under other @fn(s), so further changes in the code may create
a potential local vulnerability.

I.e., if userspace could play the same tricks with poll() or futex(),
then corruption of @clockid or @type would trigger BUG() in the timer
code.

Set @fn every time restart_block is changed, preventing surprises.
Also, add a comment for any new restart_block user.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
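For illustration only, the safety rule can be modelled in userspace.
This is a minimal sketch, not kernel code: the model_* names, the
trimmed-down union and the fail_early switch are all assumptions made
up for the example. It only shows why resetting @fn before writing to
the union keeps a stale handler from interpreting overwritten data.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct model_restart_block;
typedef long (*model_restart_fn)(struct model_restart_block *);

/* Simplified model: @fn tells which union member is currently valid. */
struct model_restart_block {
	model_restart_fn fn;
	union {
		struct { uint32_t val; uint64_t time; } futex;
		struct { int clockid; uint64_t expires; } nanosleep;
	};
};

/* Stand-in for do_no_restart_syscall(): never restarts anything. */
static long model_no_restart(struct model_restart_block *rb)
{
	(void)rb;
	return -EINTR;
}

/* Restart handler left behind by an earlier, futex-like syscall. */
static long model_futex_restart(struct model_restart_block *rb)
{
	return (long)rb->futex.val;
}

/* Restart handler that trusts the nanosleep member. */
static long model_nanosleep_restart(struct model_restart_block *rb)
{
	return (long)rb->nanosleep.clockid;
}

/*
 * Sleep-like syscall entry. Per the safety rule, @fn is reset before the
 * union is touched, so an early failure cannot leave the old handler
 * pointing at overwritten data.
 */
static long model_sleep_enter(struct model_restart_block *rb, int clockid,
			      int fail_early)
{
	assert(rb);
	rb->fn = model_no_restart;
	rb->nanosleep.clockid = clockid;
	if (fail_early)
		return -EINVAL;

	/* Interrupted: arm the handler that matches the union contents. */
	rb->fn = model_nanosleep_restart;
	return -EINTR;
}
```

Without the first assignment in model_sleep_enter(), a failing call
would leave model_futex_restart() armed over nanosleep data.
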
 include/linux/restart_block.h  | 4 ++++
 kernel/time/hrtimer.c          | 8 +++++---
 kernel/time/posix-cpu-timers.c | 6 +++---
 kernel/time/posix-stubs.c      | 8 +++++---
 kernel/time/posix-timers.c     | 8 +++++---
 5 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h
index e5078cae5567..e66e982105f4 100644
--- a/include/linux/restart_block.h
+++ b/include/linux/restart_block.h
@@ -21,6 +21,10 @@ enum timespec_type {
 
 /*
  * System call restart block.
+ *
+ * Safety rule: if you change anything inside @restart_block,
+ * set @fn to keep the structure in consistent state and prevent
+ * userspace tricks in the union.
  */
 struct restart_block {
 	long (*fn)(struct restart_block *);
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 5ee77f1a8a92..4ba2b50d068f 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1762,8 +1762,9 @@ SYSCALL_DEFINE2(nanosleep, struct __kernel_timespec __user *, rqtp,
 	if (!timespec64_valid(&tu))
 		return -EINVAL;
 
-	current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE;
-	current->restart_block.nanosleep.rmtp = rmtp;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_NATIVE : TT_NONE;
+	current->restart_block.nanosleep.rmtp	= rmtp;
 	return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 
@@ -1782,7 +1783,8 @@ SYSCALL_DEFINE2(nanosleep_time32, struct old_timespec32 __user *, rqtp,
 	if (!timespec64_valid(&tu))
 		return -EINVAL;
 
-	current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_COMPAT : TT_NONE;
 	current->restart_block.nanosleep.compat_rmtp = rmtp;
 	return hrtimer_nanosleep(&tu, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 0a426f4e3125..b4dddf74dd15 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1243,6 +1243,8 @@ void set_process_cpu_timer(struct task_struct *tsk, unsigned int clock_idx,
 	tick_dep_set_signal(tsk->signal, TICK_DEP_BIT_POSIX_TIMER);
 }
 
+static long posix_cpu_nsleep_restart(struct restart_block *restart_block);
+
 static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 			    const struct timespec64 *rqtp)
 {
@@ -1330,6 +1332,7 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		 * Report back to the user the time still remaining.
 		 */
 		restart = &current->restart_block;
+		restart->fn = posix_cpu_nsleep_restart;
 		restart->nanosleep.expires = expires;
 		if (restart->nanosleep.type != TT_NONE)
 			error = nanosleep_copyout(restart, &it.it_value);
@@ -1338,8 +1341,6 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 	return error;
 }
 
-static long posix_cpu_nsleep_restart(struct restart_block *restart_block);
-
 static int posix_cpu_nsleep(const clockid_t which_clock, int flags,
 			    const struct timespec64 *rqtp)
 {
@@ -1361,7 +1362,6 @@ static int posix_cpu_nsleep(const clockid_t which_clock, int flags,
 		if (flags & TIMER_ABSTIME)
 			return -ERESTARTNOHAND;
 
-		restart_block->fn = posix_cpu_nsleep_restart;
 		restart_block->nanosleep.clockid = which_clock;
 	}
 	return error;
diff --git a/kernel/time/posix-stubs.c b/kernel/time/posix-stubs.c
index 67df65f887ac..d73039a9ca8f 100644
--- a/kernel/time/posix-stubs.c
+++ b/kernel/time/posix-stubs.c
@@ -142,8 +142,9 @@ SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags,
 		return -EINVAL;
 	if (flags & TIMER_ABSTIME)
 		rmtp = NULL;
-	current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE;
-	current->restart_block.nanosleep.rmtp = rmtp;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_NATIVE : TT_NONE;
+	current->restart_block.nanosleep.rmtp	= rmtp;
 	return hrtimer_nanosleep(&t, flags & TIMER_ABSTIME ?
 				 HRTIMER_MODE_ABS : HRTIMER_MODE_REL,
 				 which_clock);
@@ -228,7 +229,8 @@ SYSCALL_DEFINE4(clock_nanosleep_time32, clockid_t, which_clock, int, flags,
 		return -EINVAL;
 	if (flags & TIMER_ABSTIME)
 		rmtp = NULL;
-	current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_COMPAT : TT_NONE;
 	current->restart_block.nanosleep.compat_rmtp = rmtp;
 	return hrtimer_nanosleep(&t, flags & TIMER_ABSTIME ?
 				 HRTIMER_MODE_ABS : HRTIMER_MODE_REL,
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index d7f2d91acdac..0ca0bfc20aff 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1189,8 +1189,9 @@ SYSCALL_DEFINE4(clock_nanosleep, const clockid_t, which_clock, int, flags,
 		return -EINVAL;
 	if (flags & TIMER_ABSTIME)
 		rmtp = NULL;
-	current->restart_block.nanosleep.type = rmtp ? TT_NATIVE : TT_NONE;
-	current->restart_block.nanosleep.rmtp = rmtp;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_NATIVE : TT_NONE;
+	current->restart_block.nanosleep.rmtp	= rmtp;
 
 	return kc->nsleep(which_clock, flags, &t);
 }
@@ -1216,7 +1217,8 @@ SYSCALL_DEFINE4(clock_nanosleep_time32, clockid_t, which_clock, int, flags,
 		return -EINVAL;
 	if (flags & TIMER_ABSTIME)
 		rmtp = NULL;
-	current->restart_block.nanosleep.type = rmtp ? TT_COMPAT : TT_NONE;
+	current->restart_block.fn		= do_no_restart_syscall;
+	current->restart_block.nanosleep.type	= rmtp ? TT_COMPAT : TT_NONE;
 	current->restart_block.nanosleep.compat_rmtp = rmtp;
 
 	return kc->nsleep(which_clock, flags, &t);
-- 
2.23.0



* [PATCH 3/9] select: Convert __esimate_accuracy() to ktime_t
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 1/9] futex: Remove unused uaddr2 in restart_block Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 2/9] restart_block: Prevent userspace set part of the block Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Dmitry Safonov
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

__estimate_accuracy() divides 64-bit integers twice, which is suboptimal.
Converting to ktime_t not only avoids that, but also simplifies the
logic to some extent.

The long-term goal is to convert poll() to leave the timeout value in
restart_block as ktime_t, as it's the only user that leaves it as
a timespec. That prepares the ground for introducing a new ptrace()
request that will dump the timeout of an interrupted syscall.

Furthermore, do_select() and do_poll() both actually need the time as
ktime_t for poll_schedule_timeout(), so there is a hack that converts
the time on the first loop iteration. Not only is it a hack, it's also
redone every time the poll() syscall is restarted. It'll be removed
after the conversion.

While at it, rename the parameters to "slack" and "timeout", which
describe their purpose better.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
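As a rough userspace sketch of the estimate described above (not the
kernel code): plain int64_t nanoseconds stand in for ktime_t, and the
function names plus the nice_positive and timer_slack_ns parameters
are assumptions that replace task_nice(current) and
current->timer_slack_ns. The remaining time is taken as deadline minus
now, as in the removed timespec64_sub(*tv, now).

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC	1000000LL
#define MAX_SLACK_NS	(100 * NSEC_PER_MSEC)

/* 0.1% of the remaining time (0.5% for niced tasks), capped at 100ms. */
static int64_t estimate_slack_ns(int64_t remaining_ns, int nice_positive)
{
	int64_t divfactor = 1000;
	int64_t slack;

	if (remaining_ns < 0)
		return 0;

	if (nice_positive)
		divfactor /= 5;

	slack = remaining_ns / divfactor;
	return slack > MAX_SLACK_NS ? MAX_SLACK_NS : slack;
}

/* Never report less than the task's own timer slack. */
static int64_t select_slack_ns(int64_t deadline_ns, int64_t now_ns,
			       int nice_positive, int64_t timer_slack_ns)
{
	int64_t slack;

	assert(timer_slack_ns >= 0);
	slack = estimate_slack_ns(deadline_ns - now_ns, nice_positive);
	return slack < timer_slack_ns ? timer_slack_ns : slack;
}
```

With one second left and a 50us timer slack, this yields 1ms for
a normal task and 5ms for a niced one.
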
 fs/select.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 53a0c149f528..12cdefd3be2d 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -36,7 +36,7 @@
 
 
 /*
- * Estimate expected accuracy in ns from a timeval.
+ * Estimate expected accuracy in ns.
  *
  * After quite a bit of churning around, we've settled on
  * a simple thing of taking 0.1% of the timeout as the
@@ -49,22 +49,17 @@
 
 #define MAX_SLACK	(100 * NSEC_PER_MSEC)
 
-static long __estimate_accuracy(struct timespec64 *tv)
+static long __estimate_accuracy(ktime_t slack)
 {
-	long slack;
 	int divfactor = 1000;
 
-	if (tv->tv_sec < 0)
+	if (slack < 0)
 		return 0;
 
 	if (task_nice(current) > 0)
 		divfactor = divfactor / 5;
 
-	if (tv->tv_sec > MAX_SLACK / (NSEC_PER_SEC/divfactor))
-		return MAX_SLACK;
-
-	slack = tv->tv_nsec / divfactor;
-	slack += tv->tv_sec * (NSEC_PER_SEC/divfactor);
+	slack = ktime_divns(slack, divfactor);
 
 	if (slack > MAX_SLACK)
 		return MAX_SLACK;
@@ -72,27 +67,25 @@ static long __estimate_accuracy(struct timespec64 *tv)
 	return slack;
 }
 
-u64 select_estimate_accuracy(struct timespec64 *tv)
+u64 select_estimate_accuracy(struct timespec64 *timeout)
 {
-	u64 ret;
-	struct timespec64 now;
+	ktime_t now, slack;
 
 	/*
 	 * Realtime tasks get a slack of 0 for obvious reasons.
 	 */
-
 	if (rt_task(current))
 		return 0;
 
-	ktime_get_ts64(&now);
-	now = timespec64_sub(*tv, now);
-	ret = __estimate_accuracy(&now);
-	if (ret < current->timer_slack_ns)
-		return current->timer_slack_ns;
-	return ret;
-}
+	now = ktime_get();
+	slack = now - timespec64_to_ktime(*timeout);
 
+	slack = __estimate_accuracy(slack);
+	if (slack < current->timer_slack_ns)
+		return current->timer_slack_ns;
 
+	return slack;
+}
 
 struct poll_table_page {
 	struct poll_table_page * next;
-- 
2.23.0



* [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (2 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 3/9] select: Convert __esimate_accuracy() to ktime_t Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 11:18   ` Cyrill Gorcunov
  2019-09-19 14:05   ` Cyrill Gorcunov
  2019-09-09 10:23 ` [PATCH 5/9] select: Convert select_estimate_accuracy() to take ktime_t Dmitry Safonov
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

A shift on s64 is faster than a division, so use it instead.

As a result of the patch there is a barely user-visible effect:
the estimated slack for poll(), select(), etc. gets ~2.3% smaller, so
the syscalls will be a bit more precise than before, because
1000 != 1024 :)

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
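To put a number on the "1000 != 1024" remark, here is a small userspace
sketch of the arithmetic. The helper names are made up for the example
and plain int64_t nanoseconds stand in for ktime_t.

```c
#include <assert.h>
#include <stdint.h>

/* Old estimate: 0.1% of the timeout via a 64-bit division. */
static int64_t slack_div(int64_t timeout_ns)
{
	return timeout_ns / 1000;
}

/* New estimate: divide by 1024 with a shift instead. */
static int64_t slack_shift(int64_t timeout_ns)
{
	return timeout_ns >> 10;
}

/* How much smaller the shifted slack is, in 1/100 of a percent. */
static int64_t slack_shrink_permyriad(int64_t timeout_ns)
{
	int64_t d = slack_div(timeout_ns);

	assert(d > 0);
	return (d - slack_shift(timeout_ns)) * 10000 / d;
}
```

For a one second timeout the division gives 1000000ns and the shift
gives 976562ns, i.e. about 2.34% less slack.
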
 fs/select.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 12cdefd3be2d..2477c202631e 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -51,15 +51,14 @@
 
 static long __estimate_accuracy(ktime_t slack)
 {
-	int divfactor = 1000;
-
 	if (slack < 0)
 		return 0;
 
-	if (task_nice(current) > 0)
-		divfactor = divfactor / 5;
+	/* A bit more precise than 0.1% */
+	slack = slack >> 10;
 
-	slack = ktime_divns(slack, divfactor);
+	if (task_nice(current) > 0)
+		slack = slack * 5;
 
 	if (slack > MAX_SLACK)
 		return MAX_SLACK;
-- 
2.23.0



* [PATCH 5/9] select: Convert select_estimate_accuracy() to take ktime_t
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (3 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 6/9] select: Extract common code into do_sys_ppoll() Dmitry Safonov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

Instead of converting the time on the first loop iteration, do it under
the same if (end_time) check that estimates the slack. Simplify the loop
by taking the time conversion out of it.

Also prepare the ground for converting poll()'s restart_block timeout
into ktime_t, as poll() is the only user that leaves it as a timespec.
The conversion is needed to introduce an API for ptrace() to get
a timeout from restart_block.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 fs/eventpoll.c       |  4 ++--
 fs/select.c          | 38 ++++++++++++--------------------------
 include/linux/poll.h |  2 +-
 3 files changed, 15 insertions(+), 29 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d7f1f5011fac..d5120fc49a39 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1836,9 +1836,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 	if (timeout > 0) {
 		struct timespec64 end_time = ep_set_mstimeout(timeout);
 
-		slack = select_estimate_accuracy(&end_time);
+		expires = timespec64_to_ktime(end_time);
 		to = &expires;
-		*to = timespec64_to_ktime(end_time);
+		slack = select_estimate_accuracy(expires);
 	} else if (timeout == 0) {
 		/*
 		 * Avoid the unnecessary trip to the wait queue loop, if the
diff --git a/fs/select.c b/fs/select.c
index 2477c202631e..458f2a944318 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -66,7 +66,7 @@ static long __estimate_accuracy(ktime_t slack)
 	return slack;
 }
 
-u64 select_estimate_accuracy(struct timespec64 *timeout)
+u64 select_estimate_accuracy(ktime_t timeout)
 {
 	ktime_t now, slack;
 
@@ -77,7 +77,7 @@ u64 select_estimate_accuracy(struct timespec64 *timeout)
 		return 0;
 
 	now = ktime_get();
-	slack = now - timespec64_to_ktime(*timeout);
+	slack = now - timeout;
 
 	slack = __estimate_accuracy(slack);
 	if (slack < current->timer_slack_ns)
@@ -490,8 +490,11 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
 		timed_out = 1;
 	}
 
-	if (end_time && !timed_out)
-		slack = select_estimate_accuracy(end_time);
+	if (end_time && !timed_out) {
+		expire = timespec64_to_ktime(*end_time);
+		to = &expire;
+		slack = select_estimate_accuracy(expire);
+	}
 
 	retval = 0;
 	for (;;) {
@@ -582,16 +585,6 @@ static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
 		}
 		busy_flag = 0;
 
-		/*
-		 * If this is the first loop and we have a timeout
-		 * given, then we convert to ktime_t and set the to
-		 * pointer to the expiry value.
-		 */
-		if (end_time && !to) {
-			expire = timespec64_to_ktime(*end_time);
-			to = &expire;
-		}
-
 		if (!poll_schedule_timeout(&table, TASK_INTERRUPTIBLE,
 					   to, slack))
 			timed_out = 1;
@@ -876,8 +869,11 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
 		timed_out = 1;
 	}
 
-	if (end_time && !timed_out)
-		slack = select_estimate_accuracy(end_time);
+	if (end_time && !timed_out) {
+		expire = timespec64_to_ktime(*end_time);
+		to = &expire;
+		slack = select_estimate_accuracy(expire);
+	}
 
 	for (;;) {
 		struct poll_list *walk;
@@ -930,16 +926,6 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
 		}
 		busy_flag = 0;
 
-		/*
-		 * If this is the first loop and we have a timeout
-		 * given, then we convert to ktime_t and set the to
-		 * pointer to the expiry value.
-		 */
-		if (end_time && !to) {
-			expire = timespec64_to_ktime(*end_time);
-			to = &expire;
-		}
-
 		if (!poll_schedule_timeout(wait, TASK_INTERRUPTIBLE, to, slack))
 			timed_out = 1;
 	}
diff --git a/include/linux/poll.h b/include/linux/poll.h
index 1cdc32b1f1b0..d0f21eb19257 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -112,7 +112,7 @@ struct poll_wqueues {
 
 extern void poll_initwait(struct poll_wqueues *pwq);
 extern void poll_freewait(struct poll_wqueues *pwq);
-extern u64 select_estimate_accuracy(struct timespec64 *tv);
+extern u64 select_estimate_accuracy(ktime_t timeout);
 
 #define MAX_INT64_SECONDS (((s64)(~((u64)0)>>1)/HZ)-1)
 
-- 
2.23.0



* [PATCH 6/9] select: Extract common code into do_sys_ppoll()
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (4 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 5/9] select: Convert select_estimate_accuracy() to take ktime_t Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 11:15   ` kbuild test robot
  2019-09-09 19:48   ` kbuild test robot
  2019-09-09 10:23 ` [PATCH 7/9] select: Use ktime_t in do_sys_poll() and do_poll() Dmitry Safonov
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

Reduce the amount of code and shrink the .text section a bit:
[linux]$ ./scripts/bloat-o-meter -t /tmp/vmlinux.o.{old,new}
add/remove: 1/0 grow/shrink: 0/4 up/down: 284/-691 (-407)
Function                                     old     new   delta
do_sys_ppoll                                   -     284    +284
__x64_sys_ppoll                              214      42    -172
__ia32_sys_ppoll                             213      40    -173
__ia32_compat_sys_ppoll_time64               213      40    -173
__ia32_compat_sys_ppoll_time32               213      40    -173
Total: Before=13357557, After=13357150, chg -0.00%

The downside is that the "tsp" and "sigmask" parameters become (void *),
but losing static type checking seems worth it when only one line is
left in each syscall definition.
The other way would be to add compat parameters to do_sys_ppoll(), but
that trashes 2 more registers.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
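The trade-off can be shown with a small userspace model. Everything
named model_* below is hypothetical, and plain structs stand in for the
__user pointers, so this is only a sketch of the pattern: the common
helper takes an untyped pointer plus a type tag, while the thin typed
wrappers keep the type checking at the call sites.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum model_time_type { MODEL_TIMESPEC64, MODEL_TIMESPEC32 };

struct model_ts64 { int64_t tv_sec; int64_t tv_nsec; };
struct model_ts32 { int32_t tv_sec; int32_t tv_nsec; };

/* Common helper: the layout behind @tsp is known only from @type. */
static int model_get_timeout(struct model_ts64 *out, const void *tsp,
			     enum model_time_type type)
{
	assert(out && tsp);

	switch (type) {
	case MODEL_TIMESPEC64:
		*out = *(const struct model_ts64 *)tsp;
		return 0;
	case MODEL_TIMESPEC32: {
		const struct model_ts32 *old = tsp;

		/* Widen the 32-bit fields to the common representation. */
		out->tv_sec = old->tv_sec;
		out->tv_nsec = old->tv_nsec;
		return 0;
	}
	default:
		return -ENOSYS;
	}
}

/* One-line typed entry points, like the syscall definitions. */
static int model_ppoll64(const struct model_ts64 *tsp, struct model_ts64 *out)
{
	return model_get_timeout(out, tsp, MODEL_TIMESPEC64);
}

static int model_ppoll32(const struct model_ts32 *tsp, struct model_ts64 *out)
{
	return model_get_timeout(out, tsp, MODEL_TIMESPEC32);
}
```

A wrong tag passed to the helper is not caught by the compiler, which is
exactly the static type checking that is given up.
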
 fs/select.c | 94 ++++++++++++++++++-----------------------------------
 1 file changed, 32 insertions(+), 62 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 458f2a944318..262300e58370 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1056,54 +1056,58 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
 	return ret;
 }
 
-SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds,
-		struct __kernel_timespec __user *, tsp, const sigset_t __user *, sigmask,
-		size_t, sigsetsize)
+static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
+			void __user *tsp, const void __user *sigmask,
+			size_t sigsetsize, enum poll_time_type pt_type)
 {
 	struct timespec64 ts, end_time, *to = NULL;
 	int ret;
 
 	if (tsp) {
-		if (get_timespec64(&ts, tsp))
-			return -EFAULT;
+		switch (pt_type) {
+		case PT_TIMESPEC:
+			if (get_timespec64(&ts, tsp))
+				return -EFAULT;
+			break;
+		case PT_OLD_TIMESPEC:
+			if (get_old_timespec32(&ts, tsp))
+				return -EFAULT;
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			return -ENOSYS;
+		}
 
 		to = &end_time;
 		if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
 			return -EINVAL;
 	}
 
-	ret = set_user_sigmask(sigmask, sigsetsize);
+	if (!in_compat_syscall())
+		ret = set_user_sigmask(sigmask, sigsetsize);
+	else
+		ret = set_compat_user_sigmask(sigmask, sigsetsize);
+
 	if (ret)
 		return ret;
 
 	ret = do_sys_poll(ufds, nfds, to);
-	return poll_select_finish(&end_time, tsp, PT_TIMESPEC, ret);
+	return poll_select_finish(&end_time, tsp, pt_type, ret);
 }
 
-#if defined(CONFIG_COMPAT_32BIT_TIME) && !defined(CONFIG_64BIT)
+SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds,
+		struct __kernel_timespec __user *, tsp, const sigset_t __user *, sigmask,
+		size_t, sigsetsize)
+{
+	return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_TIMESPEC);
+}
 
+#if defined(CONFIG_COMPAT_32BIT_TIME) && !defined(CONFIG_64BIT)
 SYSCALL_DEFINE5(ppoll_time32, struct pollfd __user *, ufds, unsigned int, nfds,
 		struct old_timespec32 __user *, tsp, const sigset_t __user *, sigmask,
 		size_t, sigsetsize)
 {
-	struct timespec64 ts, end_time, *to = NULL;
-	int ret;
-
-	if (tsp) {
-		if (get_old_timespec32(&ts, tsp))
-			return -EFAULT;
-
-		to = &end_time;
-		if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
-			return -EINVAL;
-	}
-
-	ret = set_user_sigmask(sigmask, sigsetsize);
-	if (ret)
-		return ret;
-
-	ret = do_sys_poll(ufds, nfds, to);
-	return poll_select_finish(&end_time, tsp, PT_OLD_TIMESPEC, ret);
+	return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_OLD_TIMESPEC);
 }
 #endif
 
@@ -1352,24 +1356,7 @@ COMPAT_SYSCALL_DEFINE5(ppoll_time32, struct pollfd __user *, ufds,
 	unsigned int,  nfds, struct old_timespec32 __user *, tsp,
 	const compat_sigset_t __user *, sigmask, compat_size_t, sigsetsize)
 {
-	struct timespec64 ts, end_time, *to = NULL;
-	int ret;
-
-	if (tsp) {
-		if (get_old_timespec32(&ts, tsp))
-			return -EFAULT;
-
-		to = &end_time;
-		if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
-			return -EINVAL;
-	}
-
-	ret = set_compat_user_sigmask(sigmask, sigsetsize);
-	if (ret)
-		return ret;
-
-	ret = do_sys_poll(ufds, nfds, to);
-	return poll_select_finish(&end_time, tsp, PT_OLD_TIMESPEC, ret);
+	return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_OLD_TIMESPEC);
 }
 #endif
 
@@ -1378,24 +1365,7 @@ COMPAT_SYSCALL_DEFINE5(ppoll_time64, struct pollfd __user *, ufds,
 	unsigned int,  nfds, struct __kernel_timespec __user *, tsp,
 	const compat_sigset_t __user *, sigmask, compat_size_t, sigsetsize)
 {
-	struct timespec64 ts, end_time, *to = NULL;
-	int ret;
-
-	if (tsp) {
-		if (get_timespec64(&ts, tsp))
-			return -EFAULT;
-
-		to = &end_time;
-		if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
-			return -EINVAL;
-	}
-
-	ret = set_compat_user_sigmask(sigmask, sigsetsize);
-	if (ret)
-		return ret;
-
-	ret = do_sys_poll(ufds, nfds, to);
-	return poll_select_finish(&end_time, tsp, PT_TIMESPEC, ret);
+	return do_sys_ppoll(ufds, nfds, tsp, sigmask, sigsetsize, PT_TIMESPEC);
 }
 
 #endif
-- 
2.23.0



* [PATCH 7/9] select: Use ktime_t in do_sys_poll() and do_poll()
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (5 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 6/9] select: Extract common code into do_sys_ppoll() Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 8/9] select/restart_block: Convert poll's timeout to u64 Dmitry Safonov
  2019-09-09 10:23 ` [PATCH 9/9] restart_block: Make common timeout Dmitry Safonov
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

The plan is to store what's left of the timeout in restart_block as
ktime_t, which will be used for futex() and nanosleep() timeouts too.
That will be the value returned by a new ptrace() request API.

Convert the end_time argument of the do_{sys_,}poll() functions to
ktime_t to prepare the ground for storing ktime_t inside restart_block.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 fs/select.c | 47 +++++++++++++++++++++++------------------------
 1 file changed, 23 insertions(+), 24 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 262300e58370..4af88feaa2fe 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -854,25 +854,22 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
 }
 
 static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
-		   struct timespec64 *end_time)
+		   ktime_t end_time)
 {
 	poll_table* pt = &wait->pt;
-	ktime_t expire, *to = NULL;
+	ktime_t *to = NULL;
 	int timed_out = 0, count = 0;
 	u64 slack = 0;
 	__poll_t busy_flag = net_busy_loop_on() ? POLL_BUSY_LOOP : 0;
 	unsigned long busy_start = 0;
 
 	/* Optimise the no-wait case */
-	if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
+	if (ktime_compare(ktime_get(), end_time) >= 0) {
 		pt->_qproc = NULL;
 		timed_out = 1;
-	}
-
-	if (end_time && !timed_out) {
-		expire = timespec64_to_ktime(*end_time);
-		to = &expire;
-		slack = select_estimate_accuracy(expire);
+	} else {
+		to = &end_time;
+		slack = select_estimate_accuracy(end_time);
 	}
 
 	for (;;) {
@@ -936,7 +933,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
 			sizeof(struct pollfd))
 
 static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
-		struct timespec64 *end_time)
+		       ktime_t end_time)
 {
 	struct poll_wqueues table;
 	int err = -EFAULT, fdcount, len;
@@ -1004,16 +1001,15 @@ static long do_restart_poll(struct restart_block *restart_block)
 {
 	struct pollfd __user *ufds = restart_block->poll.ufds;
 	int nfds = restart_block->poll.nfds;
-	struct timespec64 *to = NULL, end_time;
+	ktime_t timeout = 0;
 	int ret;
 
 	if (restart_block->poll.has_timeout) {
-		end_time.tv_sec = restart_block->poll.tv_sec;
-		end_time.tv_nsec = restart_block->poll.tv_nsec;
-		to = &end_time;
+		timeout = ktime_set(restart_block->poll.tv_sec,
+				    restart_block->poll.tv_nsec);
 	}
 
-	ret = do_sys_poll(ufds, nfds, to);
+	ret = do_sys_poll(ufds, nfds, timeout);
 
 	if (ret == -ERESTARTNOHAND) {
 		restart_block->fn = do_restart_poll;
@@ -1025,16 +1021,17 @@ static long do_restart_poll(struct restart_block *restart_block)
 SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
 		int, timeout_msecs)
 {
-	struct timespec64 end_time, *to = NULL;
+	struct timespec64 end_time;
+	ktime_t timeout = 0;
 	int ret;
 
 	if (timeout_msecs >= 0) {
-		to = &end_time;
-		poll_select_set_timeout(to, timeout_msecs / MSEC_PER_SEC,
+		poll_select_set_timeout(&end_time, timeout_msecs / MSEC_PER_SEC,
 			NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC));
+		timeout = timespec64_to_ktime(end_time);
 	}
 
-	ret = do_sys_poll(ufds, nfds, to);
+	ret = do_sys_poll(ufds, nfds, timeout);
 
 	if (ret == -ERESTARTNOHAND) {
 		struct restart_block *restart_block;
@@ -1060,7 +1057,8 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
 			void __user *tsp, const void __user *sigmask,
 			size_t sigsetsize, enum poll_time_type pt_type)
 {
-	struct timespec64 ts, end_time, *to = NULL;
+	struct timespec64 ts, *to = NULL;
+	ktime_t timeout = 0;
 	int ret;
 
 	if (tsp) {
@@ -1078,9 +1076,10 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
 			return -ENOSYS;
 		}
 
-		to = &end_time;
-		if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
+		to = &ts;
+		if (poll_select_set_timeout(&ts, ts.tv_sec, ts.tv_nsec))
 			return -EINVAL;
+		timeout = timespec64_to_ktime(ts);
 	}
 
 	if (!in_compat_syscall())
@@ -1091,8 +1090,8 @@ static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
 	if (ret)
 		return ret;
 
-	ret = do_sys_poll(ufds, nfds, to);
-	return poll_select_finish(&end_time, tsp, pt_type, ret);
+	ret = do_sys_poll(ufds, nfds, timeout);
+	return poll_select_finish(to, tsp, pt_type, ret);
 }
 
 SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds,
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 8/9] select/restart_block: Convert poll's timeout to u64
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (6 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 7/9] select: Use ktime_t in do_sys_poll() and do_poll() Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  2019-09-09 13:07   ` David Laight
  2019-09-09 10:23 ` [PATCH 9/9] restart_block: Make common timeout Dmitry Safonov
  8 siblings, 1 reply; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

All preparations are done: poll() can now store a u64 timeout in
restart_block. This enables the next step: unifying all timeouts in
restart_block and providing a ptrace() API to read them.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 fs/select.c                   | 27 +++++++--------------------
 include/linux/restart_block.h |  4 +---
 2 files changed, 8 insertions(+), 23 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 4af88feaa2fe..ff2b9c4865cd 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1001,14 +1001,9 @@ static long do_restart_poll(struct restart_block *restart_block)
 {
 	struct pollfd __user *ufds = restart_block->poll.ufds;
 	int nfds = restart_block->poll.nfds;
-	ktime_t timeout = 0;
+	ktime_t timeout = restart_block->poll.timeout;
 	int ret;
 
-	if (restart_block->poll.has_timeout) {
-		timeout = ktime_set(restart_block->poll.tv_sec,
-				    restart_block->poll.tv_nsec);
-	}
-
 	ret = do_sys_poll(ufds, nfds, timeout);
 
 	if (ret == -ERESTARTNOHAND) {
@@ -1021,14 +1016,12 @@ static long do_restart_poll(struct restart_block *restart_block)
 SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
 		int, timeout_msecs)
 {
-	struct timespec64 end_time;
 	ktime_t timeout = 0;
 	int ret;
 
 	if (timeout_msecs >= 0) {
-		poll_select_set_timeout(&end_time, timeout_msecs / MSEC_PER_SEC,
-			NSEC_PER_MSEC * (timeout_msecs % MSEC_PER_SEC));
-		timeout = timespec64_to_ktime(end_time);
+		timeout = ktime_add_ms(0, timeout_msecs);
+		timeout = ktime_add_safe(ktime_get(), timeout);
 	}
 
 	ret = do_sys_poll(ufds, nfds, timeout);
@@ -1037,16 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
 		struct restart_block *restart_block;
 
 		restart_block = &current->restart_block;
-		restart_block->fn = do_restart_poll;
-		restart_block->poll.ufds = ufds;
-		restart_block->poll.nfds = nfds;
-
-		if (timeout_msecs >= 0) {
-			restart_block->poll.tv_sec = end_time.tv_sec;
-			restart_block->poll.tv_nsec = end_time.tv_nsec;
-			restart_block->poll.has_timeout = 1;
-		} else
-			restart_block->poll.has_timeout = 0;
+		restart_block->fn		= do_restart_poll;
+		restart_block->poll.ufds	= ufds;
+		restart_block->poll.nfds	= nfds;
+		restart_block->poll.timeout	= timeout;
 
 		ret = -ERESTART_RESTARTBLOCK;
 	}
diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h
index e66e982105f4..63d647b65395 100644
--- a/include/linux/restart_block.h
+++ b/include/linux/restart_block.h
@@ -49,11 +49,9 @@ struct restart_block {
 		} nanosleep;
 		/* For poll */
 		struct {
+			u64 timeout;
 			struct pollfd __user *ufds;
 			int nfds;
-			int has_timeout;
-			unsigned long tv_sec;
-			unsigned long tv_nsec;
 		} poll;
 	};
 };
-- 
2.23.0
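
For readers following along, here is a minimal userspace sketch of the
idea in this patch: folding (has_timeout, tv_sec, tv_nsec) into a single
64-bit absolute deadline in nanoseconds. It assumes, as the diff
suggests, that a value of 0 stands for "no timeout". The names
poll_deadline_ns(), add_saturating() and SKETCH_NSEC_PER_MSEC are
hypothetical and exist only for illustration; the saturating add is
written in the spirit of ktime_add_safe(), not copied from it.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_MSEC 1000000LL	/* hypothetical name for this sketch */

/*
 * Saturating add for non-negative values. This only models the purpose
 * of an overflow-safe add; it is not the kernel implementation.
 */
static int64_t add_saturating(int64_t a, int64_t b)
{
	/* Both operands are assumed non-negative in this sketch. */
	if (b > INT64_MAX - a)
		return INT64_MAX;
	return a + b;
}

/*
 * Turn poll()'s millisecond argument into one absolute deadline.
 * A negative timeout_msecs means "wait forever", encoded here as 0.
 */
static int64_t poll_deadline_ns(int64_t now_ns, int timeout_msecs)
{
	if (timeout_msecs < 0)
		return 0;	/* assumption: 0 means "no timeout" */

	return add_saturating(now_ns,
			      (int64_t)timeout_msecs * SKETCH_NSEC_PER_MSEC);
}
```

With a single field there is no separate has_timeout flag to keep in
sync; the restart path only has to copy one value back.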


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9/9] restart_block: Make common timeout
  2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
                   ` (7 preceding siblings ...)
  2019-09-09 10:23 ` [PATCH 8/9] select/restart_block: Convert poll's timeout to u64 Dmitry Safonov
@ 2019-09-09 10:23 ` Dmitry Safonov
  8 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 10:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar,
	Oleg Nesterov, Pavel Emelyanov, Thomas Gleixner, containers,
	linux-fsdevel

To provide a unified API for reading the remaining timeout, the
timeouts of the different restart_block users can be merged into one
field. All preparations are done, so move the timeout out of the union
and convert the users.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 fs/select.c                    | 10 +++++-----
 include/linux/restart_block.h  |  4 +---
 kernel/futex.c                 | 14 +++++++-------
 kernel/time/alarmtimer.c       |  6 +++---
 kernel/time/hrtimer.c          |  6 +++---
 kernel/time/posix-cpu-timers.c |  6 +++---
 6 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index ff2b9c4865cd..9ab6fc6fb7c5 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1001,7 +1001,7 @@ static long do_restart_poll(struct restart_block *restart_block)
 {
 	struct pollfd __user *ufds = restart_block->poll.ufds;
 	int nfds = restart_block->poll.nfds;
-	ktime_t timeout = restart_block->poll.timeout;
+	ktime_t timeout = restart_block->timeout;
 	int ret;
 
 	ret = do_sys_poll(ufds, nfds, timeout);
@@ -1030,10 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
 		struct restart_block *restart_block;
 
 		restart_block = &current->restart_block;
-		restart_block->fn		= do_restart_poll;
-		restart_block->poll.ufds	= ufds;
-		restart_block->poll.nfds	= nfds;
-		restart_block->poll.timeout	= timeout;
+		restart_block->fn	 = do_restart_poll;
+		restart_block->poll.ufds = ufds;
+		restart_block->poll.nfds = nfds;
+		restart_block->timeout	 = timeout;
 
 		ret = -ERESTART_RESTARTBLOCK;
 	}
diff --git a/include/linux/restart_block.h b/include/linux/restart_block.h
index 63d647b65395..02f90ab00a2d 100644
--- a/include/linux/restart_block.h
+++ b/include/linux/restart_block.h
@@ -27,6 +27,7 @@ enum timespec_type {
  * userspace tricks in the union.
  */
 struct restart_block {
+	s64 timeout;
 	long (*fn)(struct restart_block *);
 	union {
 		/* For futex_wait and futex_wait_requeue_pi */
@@ -35,7 +36,6 @@ struct restart_block {
 			u32 val;
 			u32 flags;
 			u32 bitset;
-			u64 time;
 		} futex;
 		/* For nanosleep */
 		struct {
@@ -45,11 +45,9 @@ struct restart_block {
 				struct __kernel_timespec __user *rmtp;
 				struct old_timespec32 __user *compat_rmtp;
 			};
-			u64 expires;
 		} nanosleep;
 		/* For poll */
 		struct {
-			u64 timeout;
 			struct pollfd __user *ufds;
 			int nfds;
 		} poll;
diff --git a/kernel/futex.c b/kernel/futex.c
index 6d50728ef2e7..0738167e4911 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2755,12 +2755,12 @@ static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
 		goto out;
 
 	restart = &current->restart_block;
-	restart->fn = futex_wait_restart;
-	restart->futex.uaddr = uaddr;
-	restart->futex.val = val;
-	restart->futex.time = *abs_time;
-	restart->futex.bitset = bitset;
-	restart->futex.flags = flags | FLAGS_HAS_TIMEOUT;
+	restart->fn		= futex_wait_restart;
+	restart->futex.uaddr	= uaddr;
+	restart->futex.val	= val;
+	restart->timeout	= *abs_time;
+	restart->futex.bitset	= bitset;
+	restart->futex.flags	= flags | FLAGS_HAS_TIMEOUT;
 
 	ret = -ERESTART_RESTARTBLOCK;
 
@@ -2779,7 +2779,7 @@ static long futex_wait_restart(struct restart_block *restart)
 	ktime_t t, *tp = NULL;
 
 	if (restart->futex.flags & FLAGS_HAS_TIMEOUT) {
-		t = restart->futex.time;
+		t = restart->timeout;
 		tp = &t;
 	}
 	restart->fn = do_no_restart_syscall;
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 57518efc3810..148b187c371e 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -763,7 +763,7 @@ alarm_init_on_stack(struct alarm *alarm, enum alarmtimer_type type,
 static long __sched alarm_timer_nsleep_restart(struct restart_block *restart)
 {
 	enum  alarmtimer_type type = restart->nanosleep.clockid;
-	ktime_t exp = restart->nanosleep.expires;
+	ktime_t exp = restart->timeout;
 	struct alarm alarm;
 
 	alarm_init_on_stack(&alarm, type, alarmtimer_nsleep_wakeup);
@@ -816,9 +816,9 @@ static int alarm_timer_nsleep(const clockid_t which_clock, int flags,
 	if (flags == TIMER_ABSTIME)
 		return -ERESTARTNOHAND;
 
-	restart->fn = alarm_timer_nsleep_restart;
+	restart->fn		   = alarm_timer_nsleep_restart;
 	restart->nanosleep.clockid = type;
-	restart->nanosleep.expires = exp;
+	restart->timeout	   = exp;
 	return ret;
 }
 
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 4ba2b50d068f..18d4b0cc919c 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1709,7 +1709,7 @@ static long __sched hrtimer_nanosleep_restart(struct restart_block *restart)
 
 	hrtimer_init_on_stack(&t.timer, restart->nanosleep.clockid,
 				HRTIMER_MODE_ABS);
-	hrtimer_set_expires_tv64(&t.timer, restart->nanosleep.expires);
+	hrtimer_set_expires_tv64(&t.timer, restart->timeout);
 
 	ret = do_nanosleep(&t, HRTIMER_MODE_ABS);
 	destroy_hrtimer_on_stack(&t.timer);
@@ -1741,9 +1741,9 @@ long hrtimer_nanosleep(const struct timespec64 *rqtp,
 	}
 
 	restart = &current->restart_block;
-	restart->fn = hrtimer_nanosleep_restart;
+	restart->fn		   = hrtimer_nanosleep_restart;
 	restart->nanosleep.clockid = t.timer.base->clockid;
-	restart->nanosleep.expires = hrtimer_get_expires_tv64(&t.timer);
+	restart->timeout	   = hrtimer_get_expires_tv64(&t.timer);
 out:
 	destroy_hrtimer_on_stack(&t.timer);
 	return ret;
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index b4dddf74dd15..691de00107c2 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -1332,8 +1332,8 @@ static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
 		 * Report back to the user the time still remaining.
 		 */
 		restart = &current->restart_block;
-		restart->fn = posix_cpu_nsleep_restart;
-		restart->nanosleep.expires = expires;
+		restart->fn	 = posix_cpu_nsleep_restart;
+		restart->timeout = expires;
 		if (restart->nanosleep.type != TT_NONE)
 			error = nanosleep_copyout(restart, &it.it_value);
 	}
@@ -1372,7 +1372,7 @@ static long posix_cpu_nsleep_restart(struct restart_block *restart_block)
 	clockid_t which_clock = restart_block->nanosleep.clockid;
 	struct timespec64 t;
 
-	t = ns_to_timespec64(restart_block->nanosleep.expires);
+	t = ns_to_timespec64(restart_block->timeout);
 
 	return do_cpu_nanosleep(which_clock, TIMER_ABSTIME, &t);
 }
-- 
2.23.0
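
As an illustration of why a common field helps, here is a simplified
userspace sketch. It assumes the stored timeout is an absolute expiry in
nanoseconds and that "leftover" means expiry minus the current time;
neither the struct restart_block_sketch layout nor the helper
restart_timeout_left() comes from the patch, both are hypothetical.

```c
#include <stdint.h>

/* Simplified stand-in for struct restart_block; not the kernel layout. */
struct restart_block_sketch {
	int64_t timeout;	/* shared by every user, outside the union */
	long (*fn)(struct restart_block_sketch *);
	union {
		struct {
			uint32_t val;
			uint32_t flags;
			uint32_t bitset;
		} futex;
		struct {
			int clockid;
		} nanosleep;
		struct {
			int nfds;
		} poll;
	};
};

/*
 * One reader works for all users, because it never has to know which
 * union member is currently active.
 */
static int64_t restart_timeout_left(const struct restart_block_sketch *rb,
				    int64_t now_ns)
{
	int64_t left = rb->timeout - now_ns;

	return left > 0 ? left : 0;
}
```

Had the timeout stayed inside the union, a generic reader would first
need to find out which syscall filled in the block.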


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/9] select: Extract common code into do_sys_ppoll()
  2019-09-09 10:23 ` [PATCH 6/9] select: Extract common code into do_sys_ppoll() Dmitry Safonov
@ 2019-09-09 11:15   ` kbuild test robot
  2019-09-09 19:48   ` kbuild test robot
  1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2019-09-09 11:15 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: kbuild-all, linux-kernel, Dmitry Safonov, Dmitry Safonov,
	Adrian Reber, Alexander Viro, Andrei Vagin, Andy Lutomirski,
	Cyrill Gorcunov, Ingo Molnar, Oleg Nesterov, Pavel Emelyanov,
	Thomas Gleixner, containers, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 2379 bytes --]

Hi Dmitry,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dmitry-Safonov/restart_block-Prepare-the-ground-for-dumping-timeout/20190909-182945
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-11) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   fs/select.c: In function 'do_sys_ppoll':
>> fs/select.c:1089:9: error: implicit declaration of function 'set_compat_user_sigmask'; did you mean 'set_user_sigmask'? [-Werror=implicit-function-declaration]
      ret = set_compat_user_sigmask(sigmask, sigsetsize);
            ^~~~~~~~~~~~~~~~~~~~~~~
            set_user_sigmask
   cc1: some warnings being treated as errors

vim +1089 fs/select.c

  1058	
  1059	static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
  1060				void __user *tsp, const void __user *sigmask,
  1061				size_t sigsetsize, enum poll_time_type pt_type)
  1062	{
  1063		struct timespec64 ts, end_time, *to = NULL;
  1064		int ret;
  1065	
  1066		if (tsp) {
  1067			switch (pt_type) {
  1068			case PT_TIMESPEC:
  1069				if (get_timespec64(&ts, tsp))
  1070					return -EFAULT;
  1071				break;
  1072			case PT_OLD_TIMESPEC:
  1073				if (get_old_timespec32(&ts, tsp))
  1074					return -EFAULT;
  1075				break;
  1076			default:
  1077				WARN_ON_ONCE(1);
  1078				return -ENOSYS;
  1079			}
  1080	
  1081			to = &end_time;
  1082			if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
  1083				return -EINVAL;
  1084		}
  1085	
  1086		if (!in_compat_syscall())
  1087			ret = set_user_sigmask(sigmask, sigsetsize);
  1088		else
> 1089			ret = set_compat_user_sigmask(sigmask, sigsetsize);
  1090	
  1091		if (ret)
  1092			return ret;
  1093	
  1094		ret = do_sys_poll(ufds, nfds, to);
  1095		return poll_select_finish(&end_time, tsp, pt_type, ret);
  1096	}
  1097	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 7184 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-09 10:23 ` [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Dmitry Safonov
@ 2019-09-09 11:18   ` Cyrill Gorcunov
  2019-09-09 11:50     ` Dmitry Safonov
  2019-09-19 14:05   ` Cyrill Gorcunov
  1 sibling, 1 reply; 19+ messages in thread
From: Cyrill Gorcunov @ 2019-09-09 11:18 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, containers, linux-fsdevel

On Mon, Sep 09, 2019 at 11:23:35AM +0100, Dmitry Safonov wrote:
> A shift on s64 is faster than a division, so use it instead.
> 
> As a result of the patch there is a barely user-visible effect:
> poll(), select(), etc. syscalls will be about 2.3% more precise
> than before, because 1000 != 1024 :)
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>

> ---
>  fs/select.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/select.c b/fs/select.c
> index 12cdefd3be2d..2477c202631e 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -51,15 +51,14 @@
>  
>  static long __estimate_accuracy(ktime_t slack)
>  {
> -	int divfactor = 1000;
> -
>  	if (slack < 0)
>  		return 0;
>  
> -	if (task_nice(current) > 0)
> -		divfactor = divfactor / 5;
> +	/* A bit more precise than 0.1% */
> +	slack = slack >> 10;
>  
> -	slack = ktime_divns(slack, divfactor);
> +	if (task_nice(current) > 0)
> +		slack = slack * 5;
>  
>  	if (slack > MAX_SLACK)
>  		return MAX_SLACK;

The compiler precomputes constants, so it doesn't do a division here.
But I haven't read the series yet, so I might be missing
something obvious.
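
For reference, the two computations under discussion can be compared in
a small userspace sketch. It mirrors the quoted hunk but leaves out the
MAX_SLACK clamp, and it takes the nice value as a plain parameter
instead of calling task_nice(current); the function names old_slack()
and new_slack() are hypothetical.

```c
#include <stdint.h>

/* Before the patch: divide by 1000, or by 200 for niced tasks. */
static int64_t old_slack(int64_t slack_ns, int nice)
{
	int divfactor = 1000;

	if (slack_ns < 0)
		return 0;

	if (nice > 0)
		divfactor = divfactor / 5;

	return slack_ns / divfactor;
}

/* After the patch: shift by 10 (divide by 1024), then scale by 5. */
static int64_t new_slack(int64_t slack_ns, int nice)
{
	if (slack_ns < 0)
		return 0;

	slack_ns = slack_ns >> 10;

	if (nice > 0)
		slack_ns = slack_ns * 5;

	return slack_ns;
}
```

For an input of 1024000 ns the old code yields 1024 ns and the new code
1000 ns, which is the roughly 2.3% difference the changelog mentions.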

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-09 11:18   ` Cyrill Gorcunov
@ 2019-09-09 11:50     ` Dmitry Safonov
  2019-09-09 12:14       ` Cyrill Gorcunov
  0 siblings, 1 reply; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-09 11:50 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Dmitry Safonov, open list, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, Linux Containers,
	linux-fsdevel

Hi Cyrill,

On Mon, 9 Sep 2019 at 12:18, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> The compiler precomputes constants, so it doesn't do a division here.
> But I haven't read the series yet, so I might be missing
> something obvious.

Heh, but isn't the division inside ktime_divns()?

Thanks,
             Dmitry

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-09 11:50     ` Dmitry Safonov
@ 2019-09-09 12:14       ` Cyrill Gorcunov
  0 siblings, 0 replies; 19+ messages in thread
From: Cyrill Gorcunov @ 2019-09-09 12:14 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: Dmitry Safonov, open list, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, Linux Containers,
	linux-fsdevel

On Mon, Sep 09, 2019 at 12:50:27PM +0100, Dmitry Safonov wrote:
> Hi Cyrill,
> 
> On Mon, 9 Sep 2019 at 12:18, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> > The compiler precomputes constants, so it doesn't do a division here.
> > But I haven't read the series yet, so I might be missing
> > something obvious.
> 
> Heh, but isn't the division inside ktime_divns()?

Ah, you meant the ktime_divns() call you dropped. I thought
you were talking about the constant value we had here before
your patch. It seems I didn't get the changelog right. Anyway, I
need to take a closer look at the series. Hopefully soon.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH 8/9] select/restart_block: Convert poll's timeout to u64
  2019-09-09 10:23 ` [PATCH 8/9] select/restart_block: Convert poll's timeout to u64 Dmitry Safonov
@ 2019-09-09 13:07   ` David Laight
  2019-09-16 15:19     ` Dmitry Safonov
  0 siblings, 1 reply; 19+ messages in thread
From: David Laight @ 2019-09-09 13:07 UTC (permalink / raw)
  To: 'Dmitry Safonov', linux-kernel
  Cc: Dmitry Safonov, Adrian Reber, Alexander Viro, Andrei Vagin,
	Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, containers, linux-fsdevel

From: Dmitry Safonov
> Sent: 09 September 2019 11:24
> 
> All preparations are done: poll() can now store a u64 timeout in
> restart_block. This enables the next step: unifying all timeouts in
> restart_block and providing a ptrace() API to read them.
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  fs/select.c                   | 27 +++++++--------------------
>  include/linux/restart_block.h |  4 +---
>  2 files changed, 8 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/select.c b/fs/select.c
> index 4af88feaa2fe..ff2b9c4865cd 100644
> --- a/fs/select.c
> +++ b/fs/select.c
...
> @@ -1037,16 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
>  		struct restart_block *restart_block;
> 
>  		restart_block = &current->restart_block;
> -		restart_block->fn = do_restart_poll;
> -		restart_block->poll.ufds = ufds;
> -		restart_block->poll.nfds = nfds;
> -
> -		if (timeout_msecs >= 0) {
> -			restart_block->poll.tv_sec = end_time.tv_sec;
> -			restart_block->poll.tv_nsec = end_time.tv_nsec;
> -			restart_block->poll.has_timeout = 1;
> -		} else
> -			restart_block->poll.has_timeout = 0;
> +		restart_block->fn		= do_restart_poll;
> +		restart_block->poll.ufds	= ufds;
> +		restart_block->poll.nfds	= nfds;
> +		restart_block->poll.timeout	= timeout;

What is all that whitespace for?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/9] select: Extract common code into do_sys_ppoll()
  2019-09-09 10:23 ` [PATCH 6/9] select: Extract common code into do_sys_ppoll() Dmitry Safonov
  2019-09-09 11:15   ` kbuild test robot
@ 2019-09-09 19:48   ` kbuild test robot
  1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2019-09-09 19:48 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: kbuild-all, linux-kernel, Dmitry Safonov, Dmitry Safonov,
	Adrian Reber, Alexander Viro, Andrei Vagin, Andy Lutomirski,
	Cyrill Gorcunov, Ingo Molnar, Oleg Nesterov, Pavel Emelyanov,
	Thomas Gleixner, containers, linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 8620 bytes --]

Hi Dmitry,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3-rc8 next-20190904]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dmitry-Safonov/restart_block-Prepare-the-ground-for-dumping-timeout/20190909-182945
config: i386-randconfig-a003-201936 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.2-10+deb8u1) 4.9.2
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   fs/select.c: In function 'do_sys_ppoll':
>> fs/select.c:1089:3: error: implicit declaration of function 'set_compat_user_sigmask' [-Werror=implicit-function-declaration]
      ret = set_compat_user_sigmask(sigmask, sigsetsize);
      ^
   Cyclomatic Complexity 5 include/linux/compiler.h:__read_once_size
   Cyclomatic Complexity 5 include/linux/compiler.h:__write_once_size
   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_read
   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_write
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls
   Cyclomatic Complexity 2 include/asm-generic/bitops-instrumented.h:test_bit
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u32
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-long.h:atomic_long_inc
   Cyclomatic Complexity 1 include/linux/time64.h:timespec64_sub
   Cyclomatic Complexity 3 include/linux/time64.h:timespec64_valid
   Cyclomatic Complexity 1 arch/x86/include/asm/current.h:get_current
   Cyclomatic Complexity 68 include/asm-generic/getorder.h:get_order
   Cyclomatic Complexity 1 include/linux/thread_info.h:test_ti_thread_flag
   Cyclomatic Complexity 1 include/linux/thread_info.h:check_object_size
   Cyclomatic Complexity 2 include/linux/thread_info.h:copy_overflow
   Cyclomatic Complexity 4 include/linux/thread_info.h:check_copy_size
   Cyclomatic Complexity 1 arch/x86/include/asm/preempt.h:preempt_count
   Cyclomatic Complexity 5 arch/x86/include/asm/preempt.h:__preempt_count_add
   Cyclomatic Complexity 5 arch/x86/include/asm/preempt.h:__preempt_count_sub
   Cyclomatic Complexity 1 include/linux/rcupdate.h:__rcu_read_lock
   Cyclomatic Complexity 1 include/linux/rcupdate.h:__rcu_read_unlock
   Cyclomatic Complexity 2 include/linux/ktime.h:ktime_set
   Cyclomatic Complexity 1 include/linux/ktime.h:timespec64_to_ktime
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_lock_acquire
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_lock_release
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_lock
   Cyclomatic Complexity 1 include/linux/rcupdate.h:rcu_read_unlock
   Cyclomatic Complexity 1 include/linux/wait.h:init_waitqueue_func_entry
   Cyclomatic Complexity 1 include/linux/sched.h:task_nice
   Cyclomatic Complexity 1 include/linux/sched.h:task_thread_info
   Cyclomatic Complexity 1 include/linux/sched.h:test_tsk_thread_flag
   Cyclomatic Complexity 1 include/linux/sched.h:need_resched
   Cyclomatic Complexity 1 arch/x86/include/asm/smap.h:clac
   Cyclomatic Complexity 1 arch/x86/include/asm/smap.h:stac
   Cyclomatic Complexity 3 arch/x86/include/asm/uaccess.h:__chk_range_not_ok
   Cyclomatic Complexity 1 arch/x86/include/asm/uaccess_32.h:raw_copy_to_user
   Cyclomatic Complexity 1 include/linux/uaccess.h:__copy_to_user
   Cyclomatic Complexity 2 include/linux/uaccess.h:copy_from_user
   Cyclomatic Complexity 2 include/linux/uaccess.h:copy_to_user
   Cyclomatic Complexity 1 include/linux/uaccess.h:pagefault_disabled
   Cyclomatic Complexity 1 include/linux/sched/signal.h:signal_pending
   Cyclomatic Complexity 2 include/linux/sched/signal.h:test_and_clear_restore_sigmask
   Cyclomatic Complexity 2 include/linux/sched/signal.h:restore_saved_sigmask
   Cyclomatic Complexity 3 include/linux/sched/signal.h:restore_saved_sigmask_unless
   Cyclomatic Complexity 1 include/linux/sched/signal.h:task_rlimit
   Cyclomatic Complexity 1 include/linux/sched/signal.h:rlimit
   Cyclomatic Complexity 2 include/linux/sched/rt.h:rt_prio
   Cyclomatic Complexity 1 include/linux/sched/rt.h:rt_task
   Cyclomatic Complexity 1 include/linux/fs.h:get_file
   Cyclomatic Complexity 8 include/linux/overflow.h:__ab_c_size
   Cyclomatic Complexity 1 include/linux/mm.h:kvmalloc
   Cyclomatic Complexity 1 include/linux/poll.h:init_poll_funcptr
   Cyclomatic Complexity 2 include/linux/poll.h:vfs_poll
   Cyclomatic Complexity 1 include/linux/poll.h:mangle_poll
   Cyclomatic Complexity 1 include/linux/poll.h:demangle_poll
   Cyclomatic Complexity 2 include/linux/file.h:fdput
   Cyclomatic Complexity 1 include/linux/file.h:__to_fd
   Cyclomatic Complexity 1 include/linux/file.h:fdget
   Cyclomatic Complexity 1 include/linux/kasan.h:kasan_kmalloc
   Cyclomatic Complexity 1 include/linux/slab.h:kmalloc_type
   Cyclomatic Complexity 28 include/linux/slab.h:kmalloc_index
   Cyclomatic Complexity 1 include/linux/slab.h:kmem_cache_alloc_trace
   Cyclomatic Complexity 1 include/linux/slab.h:kmalloc_order_trace
   Cyclomatic Complexity 1 include/linux/slab.h:kmalloc_large
   Cyclomatic Complexity 4 include/linux/slab.h:kmalloc
   Cyclomatic Complexity 1 include/linux/compat.h:in_compat_syscall
   Cyclomatic Complexity 1 include/linux/sched/clock.h:local_clock
   Cyclomatic Complexity 1 include/net/busy_poll.h:net_busy_loop_on
   Cyclomatic Complexity 1 include/net/busy_poll.h:busy_loop_current_time
   Cyclomatic Complexity 5 include/net/busy_poll.h:busy_loop_timeout
   Cyclomatic Complexity 4 fs/select.c:__estimate_accuracy
   Cyclomatic Complexity 3 fs/select.c:get_fd_set
   Cyclomatic Complexity 2 fs/select.c:set_fd_set
   Cyclomatic Complexity 1 fs/select.c:zero_fd_set
   Cyclomatic Complexity 9 fs/select.c:max_select_fd
   Cyclomatic Complexity 3 fs/select.c:wait_key_set
   Cyclomatic Complexity 1 fs/select.c:__do_sys_select
   Cyclomatic Complexity 1 fs/select.c:__se_sys_select
   Cyclomatic Complexity 16 fs/select.c:__do_sys_pselect6
   Cyclomatic Complexity 1 fs/select.c:__se_sys_pselect6
   Cyclomatic Complexity 16 fs/select.c:__do_sys_pselect6_time32
   Cyclomatic Complexity 1 fs/select.c:__se_sys_pselect6_time32
   Cyclomatic Complexity 2 fs/select.c:__do_sys_old_select
   Cyclomatic Complexity 1 fs/select.c:__se_sys_old_select
   Cyclomatic Complexity 4 fs/select.c:do_pollfd
   Cyclomatic Complexity 4 fs/select.c:__do_sys_poll
   Cyclomatic Complexity 1 fs/select.c:__se_sys_poll
   Cyclomatic Complexity 1 fs/select.c:__do_sys_ppoll
   Cyclomatic Complexity 1 fs/select.c:__se_sys_ppoll
   Cyclomatic Complexity 1 fs/select.c:__do_sys_ppoll_time32
   Cyclomatic Complexity 1 fs/select.c:__se_sys_ppoll_time32
   Cyclomatic Complexity 1 fs/select.c:__pollwake
   Cyclomatic Complexity 3 fs/select.c:pollwake
   Cyclomatic Complexity 5 fs/select.c:poll_get_entry

vim +/set_compat_user_sigmask +1089 fs/select.c

  1058	
  1059	static int do_sys_ppoll(struct pollfd __user *ufds, unsigned int nfds,
  1060				void __user *tsp, const void __user *sigmask,
  1061				size_t sigsetsize, enum poll_time_type pt_type)
  1062	{
  1063		struct timespec64 ts, end_time, *to = NULL;
  1064		int ret;
  1065	
  1066		if (tsp) {
  1067			switch (pt_type) {
  1068			case PT_TIMESPEC:
  1069				if (get_timespec64(&ts, tsp))
  1070					return -EFAULT;
  1071				break;
  1072			case PT_OLD_TIMESPEC:
  1073				if (get_old_timespec32(&ts, tsp))
  1074					return -EFAULT;
  1075				break;
  1076			default:
  1077				WARN_ON_ONCE(1);
  1078				return -ENOSYS;
  1079			}
  1080	
  1081			to = &end_time;
  1082			if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
  1083				return -EINVAL;
  1084		}
  1085	
  1086		if (!in_compat_syscall())
  1087			ret = set_user_sigmask(sigmask, sigsetsize);
  1088		else
> 1089			ret = set_compat_user_sigmask(sigmask, sigsetsize);
  1090	
  1091		if (ret)
  1092			return ret;
  1093	
  1094		ret = do_sys_poll(ufds, nfds, to);
  1095		return poll_select_finish(&end_time, tsp, pt_type, ret);
  1096	}
  1097	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32361 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 8/9] select/restart_block: Convert poll's timeout to u64
  2019-09-09 13:07   ` David Laight
@ 2019-09-16 15:19     ` Dmitry Safonov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-16 15:19 UTC (permalink / raw)
  To: David Laight, linux-kernel
  Cc: Dmitry Safonov, Adrian Reber, Alexander Viro, Andrei Vagin,
	Andy Lutomirski, Cyrill Gorcunov, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, containers, linux-fsdevel

On 9/9/19 2:07 PM, David Laight wrote:
> From: Dmitry Safonov
>> Sent: 09 September 2019 11:24
>>
>> All preparations are done: poll() can now store a u64 timeout in
>> restart_block. This enables the next step: unifying all timeouts in
>> restart_block and providing a ptrace() API to read them.
>>
>> Signed-off-by: Dmitry Safonov <dima@arista.com>
>> ---
>>  fs/select.c                   | 27 +++++++--------------------
>>  include/linux/restart_block.h |  4 +---
>>  2 files changed, 8 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/select.c b/fs/select.c
>> index 4af88feaa2fe..ff2b9c4865cd 100644
>> --- a/fs/select.c
>> +++ b/fs/select.c
> ...
>> @@ -1037,16 +1030,10 @@ SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
>>  		struct restart_block *restart_block;
>>
>>  		restart_block = &current->restart_block;
>> -		restart_block->fn = do_restart_poll;
>> -		restart_block->poll.ufds = ufds;
>> -		restart_block->poll.nfds = nfds;
>> -
>> -		if (timeout_msecs >= 0) {
>> -			restart_block->poll.tv_sec = end_time.tv_sec;
>> -			restart_block->poll.tv_nsec = end_time.tv_nsec;
>> -			restart_block->poll.has_timeout = 1;
>> -		} else
>> -			restart_block->poll.has_timeout = 0;
>> +		restart_block->fn		= do_restart_poll;
>> +		restart_block->poll.ufds	= ufds;
>> +		restart_block->poll.nfds	= nfds;
>> +		restart_block->poll.timeout	= timeout;
> 
> What is all that whitespace for?

I aligned them with tabs just to make it look better.
I have no strong feelings about this - I can align with spaces or drop
the alignment altogether.

Thanks,
          Dmitry

* Re: [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-09 10:23 ` [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Dmitry Safonov
  2019-09-09 11:18   ` Cyrill Gorcunov
@ 2019-09-19 14:05   ` Cyrill Gorcunov
  2019-09-19 14:25     ` Dmitry Safonov
  1 sibling, 1 reply; 19+ messages in thread
From: Cyrill Gorcunov @ 2019-09-19 14:05 UTC (permalink / raw)
  To: Dmitry Safonov
  Cc: linux-kernel, Dmitry Safonov, Adrian Reber, Alexander Viro,
	Andrei Vagin, Andy Lutomirski, Ingo Molnar, Oleg Nesterov,
	Pavel Emelyanov, Thomas Gleixner, containers, linux-fsdevel

On Mon, Sep 09, 2019 at 11:23:35AM +0100, Dmitry Safonov wrote:
> A shift on s64 is faster than a division, so use it instead.
> 
> As a result of the patch there is a barely user-visible effect:
> poll(), select(), etc. syscalls will be about 2.3% more precise
> than before because 1000 != 1024 :)
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  fs/select.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/select.c b/fs/select.c
> index 12cdefd3be2d..2477c202631e 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -51,15 +51,14 @@
>  
>  static long __estimate_accuracy(ktime_t slack)
>  {
> -	int divfactor = 1000;
> -
>  	if (slack < 0)
>  		return 0;

Btw, wouldn't it be better to use <= here?

* Re: [PATCH 4/9] select: Micro-optimise __estimate_accuracy()
  2019-09-19 14:05   ` Cyrill Gorcunov
@ 2019-09-19 14:25     ` Dmitry Safonov
  0 siblings, 0 replies; 19+ messages in thread
From: Dmitry Safonov @ 2019-09-19 14:25 UTC (permalink / raw)
  To: Cyrill Gorcunov, Dmitry Safonov
  Cc: linux-kernel, Adrian Reber, Alexander Viro, Andrei Vagin,
	Andy Lutomirski, Ingo Molnar, Oleg Nesterov, Pavel Emelyanov,
	Thomas Gleixner, containers, linux-fsdevel

On 9/19/19 3:05 PM, Cyrill Gorcunov wrote:
[..]
>> diff --git a/fs/select.c b/fs/select.c
>> index 12cdefd3be2d..2477c202631e 100644
>> --- a/fs/select.c
>> +++ b/fs/select.c
>> @@ -51,15 +51,14 @@
>>  
>>  static long __estimate_accuracy(ktime_t slack)
>>  {
>> -	int divfactor = 1000;
>> -
>>  	if (slack < 0)
>>  		return 0;
> 
> Btw, wouldn't it be better to use <= here?
> 

Good point, will do for v2.

Thanks,
          Dmitry
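
For reference, the "1000 != 1024" remark and the "<=" suggestion can be
illustrated with a small userspace sketch. This is not the kernel's
__estimate_accuracy(): slack_div() and slack_shift() are hypothetical
names, the nice-level adjustment and the MAX_SLACK clamp are left out, and
the assumption is only that the old code divided by 1000 where the new
code shifts right by 10.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the two variants discussed in the thread.
 * Neither is the kernel's __estimate_accuracy(); both take a timeout in
 * nanoseconds and return the slack estimate in nanoseconds.
 */

/* Division-based estimate: timeout / 1000. */
int64_t slack_div(int64_t timeout_ns)
{
	/* "<=" also short-circuits a zero timeout, as suggested in review. */
	if (timeout_ns <= 0)
		return 0;

	return timeout_ns / 1000;
}

/* Shift-based estimate: timeout / 1024, computed as a right shift. */
int64_t slack_shift(int64_t timeout_ns)
{
	/*
	 * The guard also keeps negative values away from ">>", whose
	 * result on negative signed integers is implementation-defined.
	 */
	if (timeout_ns <= 0)
		return 0;

	return timeout_ns >> 10;
}
```

For a timeout of 1024000 ns, slack_div() returns 1024 and slack_shift()
returns 1000. The shifted estimate is smaller by 24/1024, roughly 2.3%,
which matches the figure quoted in the commit message.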

Thread overview: 19+ messages
2019-09-09 10:23 [PATCH 0/9] restart_block: Prepare the ground for dumping timeout Dmitry Safonov
2019-09-09 10:23 ` [PATCH 1/9] futex: Remove unused uaddr2 in restart_block Dmitry Safonov
2019-09-09 10:23 ` [PATCH 2/9] restart_block: Prevent userspace set part of the block Dmitry Safonov
2019-09-09 10:23 ` [PATCH 3/9] select: Convert __esimate_accuracy() to ktime_t Dmitry Safonov
2019-09-09 10:23 ` [PATCH 4/9] select: Micro-optimise __estimate_accuracy() Dmitry Safonov
2019-09-09 11:18   ` Cyrill Gorcunov
2019-09-09 11:50     ` Dmitry Safonov
2019-09-09 12:14       ` Cyrill Gorcunov
2019-09-19 14:05   ` Cyrill Gorcunov
2019-09-19 14:25     ` Dmitry Safonov
2019-09-09 10:23 ` [PATCH 5/9] select: Convert select_estimate_accuracy() to take ktime_t Dmitry Safonov
2019-09-09 10:23 ` [PATCH 6/9] select: Extract common code into do_sys_ppoll() Dmitry Safonov
2019-09-09 11:15   ` kbuild test robot
2019-09-09 19:48   ` kbuild test robot
2019-09-09 10:23 ` [PATCH 7/9] select: Use ktime_t in do_sys_poll() and do_poll() Dmitry Safonov
2019-09-09 10:23 ` [PATCH 8/9] select/restart_block: Convert poll's timeout to u64 Dmitry Safonov
2019-09-09 13:07   ` David Laight
2019-09-16 15:19     ` Dmitry Safonov
2019-09-09 10:23 ` [PATCH 9/9] restart_block: Make common timeout Dmitry Safonov
