All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] time namespace aware system boot time
@ 2020-10-19 19:52 Michael Weiß
  2020-10-19 19:52 ` [PATCH v4 1/3] timens: additional helper functions for boottime offset handling Michael Weiß
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Michael Weiß @ 2020-10-19 19:52 UTC (permalink / raw)
  To: Thomas Gleixner, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

Time namespaces make it possible to virtualize time inside of
containers, e.g., it is feasible to reset the uptime of a container
to zero by setting the time namespace offset for boottime to the
negated current value of the CLOCK_BOOTTIME.

However, the boot time stamp provided by getboottime64() does not
take care of time namespaces. The resulting boot time stamp 'btime'
provided by /proc/stat does not show a plausible time stamp inside
the time namespace of a container.

We address this by shifting the value returned by getboottime64()
by subtracting the boottime offset of the time namespace.
(A selftest to check the expected /proc/stat 'btime' inside the
namespace is provided.)

Further, to avoid to show processes as time travelers inside of the
time namespace the boottime offset then needs to be added to the
start_boottime provided by the task_struct.

v4 Changes:
Avoid type conversions back and forth between timespec64 and ktime_t
in 'proc/stat.c' as suggested by Andrei.
Introduced timens_sub_boottime() in 'time_namespace.h' to provide
better coder readability/consistency.

v3 Changes:
leave getboottime64() unchanged and shift the boot timestamp in
'fs/proc/stat.c' as result of the discussion with Andrei and Thomas.

v2 Changes:
Fixed compile errors with TIME_NS not set in config

Michael Weiß (3):
  timens: additional helper functions for boottime offset handling
  fs/proc: apply the time namespace offset to /proc/stat btime
  selftests/timens: added selftest for /proc/stat btime

 fs/proc/array.c                         |  6 ++-
 fs/proc/stat.c                          |  3 ++
 include/linux/time_namespace.h          | 22 ++++++++++
 tools/testing/selftests/timens/procfs.c | 58 ++++++++++++++++++++++++-
 4 files changed, 86 insertions(+), 3 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v4 1/3] timens: additional helper functions for boottime offset handling
  2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
@ 2020-10-19 19:52 ` Michael Weiß
  2020-10-19 19:52 ` [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime Michael Weiß
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Michael Weiß @ 2020-10-19 19:52 UTC (permalink / raw)
  To: Thomas Gleixner, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

Provide functions for time_namespace to subtract the boottime offset
from a timespec64 as well as to apply the boottime offset to u64 types in
nanoseconds.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
 include/linux/time_namespace.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h
index 5b6031385db0..68770ac9ba89 100644
--- a/include/linux/time_namespace.h
+++ b/include/linux/time_namespace.h
@@ -77,6 +77,20 @@ static inline void timens_add_boottime(struct timespec64 *ts)
 	*ts = timespec64_add(*ts, ns_offsets->boottime);
 }
 
+static inline u64 timens_add_boottime_ns(u64 nsec)
+{
+	struct timens_offsets *ns_offsets = &current->nsproxy->time_ns->offsets;
+
+	return nsec + timespec64_to_ns(&ns_offsets->boottime);
+}
+
+static inline void timens_sub_boottime(struct timespec64 *ts)
+{
+	struct timens_offsets *ns_offsets = &current->nsproxy->time_ns->offsets;
+
+	*ts = timespec64_sub(*ts, ns_offsets->boottime);
+}
+
 ktime_t do_timens_ktime_to_host(clockid_t clockid, ktime_t tim,
 				struct timens_offsets *offsets);
 
@@ -130,6 +144,14 @@ static inline int timens_on_fork(struct nsproxy *nsproxy,
 
 static inline void timens_add_monotonic(struct timespec64 *ts) { }
 static inline void timens_add_boottime(struct timespec64 *ts) { }
+
+static inline u64 timens_add_boottime_ns(u64 nsec)
+{
+	return nsec;
+}
+
+static inline void timens_sub_boottime(struct timespec64 *ts) { }
+
 static inline ktime_t timens_ktime_to_host(clockid_t clockid, ktime_t tim)
 {
 	return tim;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime
  2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
  2020-10-19 19:52 ` [PATCH v4 1/3] timens: additional helper functions for boottime offset handling Michael Weiß
@ 2020-10-19 19:52 ` Michael Weiß
  2020-10-26 10:28   ` Thomas Gleixner
  2020-10-19 19:52 ` [PATCH v4 3/3] selftests/timens: added selftest for " Michael Weiß
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Michael Weiß @ 2020-10-19 19:52 UTC (permalink / raw)
  To: Thomas Gleixner, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

'/proc/stat' provides the field 'btime' which states the time stamp of
system boot in seconds. In case of time namespaces, the offset to the
boot time stamp was not applied earlier. However, in container
runtimes which utilize time namespaces to virtualize boottime of a
container, this leaks information about the host system boot time.

Therefore, we make procfs to virtualize also the btime field by
subtracting the offset of the timens boottime from 'btime' before
printing the stats.

Since start_boottime of processes are seconds since boottime and the
boottime stamp is now shifted according to the timens offset, the
offset of the time namespace also needs to be applied before the
process stats are given to userspace.

This avoids that processes shown, e.g., by 'ps' appear as time
travelers in the corresponding time namespace.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
 fs/proc/array.c | 6 ++++--
 fs/proc/stat.c  | 3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 65ec2029fa80..277f654f289e 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -56,6 +56,7 @@
 #include <linux/types.h>
 #include <linux/errno.h>
 #include <linux/time.h>
+#include <linux/time_namespace.h>
 #include <linux/kernel.h>
 #include <linux/kernel_stat.h>
 #include <linux/tty.h>
@@ -533,8 +534,9 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	priority = task_prio(task);
 	nice = task_nice(task);
 
-	/* convert nsec -> ticks */
-	start_time = nsec_to_clock_t(task->start_boottime);
+	/* apply timens offset for boottime and convert nsec -> ticks */
+	start_time =
+		nsec_to_clock_t(timens_add_boottime_ns(task->start_boottime));
 
 	seq_put_decimal_ull(m, "", pid_nr_ns(pid, ns));
 	seq_puts(m, " (");
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 46b3293015fe..9df128ea9417 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -10,6 +10,7 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/time.h>
+#include <linux/time_namespace.h>
 #include <linux/irqnr.h>
 #include <linux/sched/cputime.h>
 #include <linux/tick.h>
@@ -118,6 +119,8 @@ static int show_stat(struct seq_file *p, void *v)
 		irq = softirq = steal = 0;
 	guest = guest_nice = 0;
 	getboottime64(&boottime);
+	/* shift boot timestamp according to the timens offset */
+	timens_sub_boottime(&boottime);
 
 	for_each_possible_cpu(i) {
 		struct kernel_cpustat kcpustat;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v4 3/3] selftests/timens: added selftest for /proc/stat btime
  2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
  2020-10-19 19:52 ` [PATCH v4 1/3] timens: additional helper functions for boottime offset handling Michael Weiß
  2020-10-19 19:52 ` [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime Michael Weiß
@ 2020-10-19 19:52 ` Michael Weiß
  2020-10-22  8:36 ` [PATCH v4 0/3] time namespace aware system boot time Andrei Vagin
  2020-10-26 10:27 ` Thomas Gleixner
  4 siblings, 0 replies; 8+ messages in thread
From: Michael Weiß @ 2020-10-19 19:52 UTC (permalink / raw)
  To: Thomas Gleixner, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

Test that btime value of /proc/stat is as expected in the time namespace
using a simple parser to get btime from /proc/stat.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
 tools/testing/selftests/timens/procfs.c | 58 ++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/timens/procfs.c b/tools/testing/selftests/timens/procfs.c
index 7f14f0fdac84..f2519154208a 100644
--- a/tools/testing/selftests/timens/procfs.c
+++ b/tools/testing/selftests/timens/procfs.c
@@ -93,6 +93,33 @@ static int read_proc_uptime(struct timespec *uptime)
 	return 0;
 }
 
+static int read_proc_stat_btime(unsigned long long *boottime_sec)
+{
+	FILE *proc;
+	char line_buf[2048];
+
+	proc = fopen("/proc/stat", "r");
+	if (proc == NULL) {
+		pr_perror("Unable to open /proc/stat");
+		return -1;
+	}
+
+	while (fgets(line_buf, 2048, proc)) {
+		if (sscanf(line_buf, "btime %llu", boottime_sec) != 1)
+			continue;
+		fclose(proc);
+		return 0;
+	}
+	if (errno) {
+		pr_perror("fscanf");
+		fclose(proc);
+		return -errno;
+	}
+	pr_err("failed to parse /proc/stat");
+	fclose(proc);
+	return -1;
+}
+
 static int check_uptime(void)
 {
 	struct timespec uptime_new, uptime_old;
@@ -123,18 +150,47 @@ static int check_uptime(void)
 	return 0;
 }
 
+static int check_stat_btime(void)
+{
+	unsigned long long btime_new, btime_old;
+	unsigned long long btime_expected;
+
+	if (switch_ns(parent_ns))
+		return pr_err("switch_ns(%d)", parent_ns);
+
+	if (read_proc_stat_btime(&btime_old))
+		return 1;
+
+	if (switch_ns(child_ns))
+		return pr_err("switch_ns(%d)", child_ns);
+
+	if (read_proc_stat_btime(&btime_new))
+		return 1;
+
+	btime_expected = btime_old - TEN_DAYS_IN_SEC;
+	if (btime_new != btime_expected) {
+		pr_fail("btime in /proc/stat: old %llu, new %llu [%llu]",
+			btime_old, btime_new, btime_expected);
+		return 1;
+	}
+
+	ksft_test_result_pass("Passed for /proc/stat btime\n");
+	return 0;
+}
+
 int main(int argc, char *argv[])
 {
 	int ret = 0;
 
 	nscheck();
 
-	ksft_set_plan(1);
+	ksft_set_plan(2);
 
 	if (init_namespaces())
 		return 1;
 
 	ret |= check_uptime();
+	ret |= check_stat_btime();
 
 	if (ret)
 		ksft_exit_fail();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 0/3] time namespace aware system boot time
  2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
                   ` (2 preceding siblings ...)
  2020-10-19 19:52 ` [PATCH v4 3/3] selftests/timens: added selftest for " Michael Weiß
@ 2020-10-22  8:36 ` Andrei Vagin
  2020-10-26 10:27 ` Thomas Gleixner
  4 siblings, 0 replies; 8+ messages in thread
From: Andrei Vagin @ 2020-10-22  8:36 UTC (permalink / raw)
  To: Michael Weiß
  Cc: Thomas Gleixner, Christian Brauner, Dmitry Safonov, linux-kernel,
	J . Bruce Fields, Chuck Lever, Trond Myklebust, Anna Schumaker

On Mon, Oct 19, 2020 at 09:52:54PM +0200, Michael Weiß wrote:
> Time namespaces make it possible to virtualize time inside of
> containers, e.g., it is feasible to reset the uptime of a container
> to zero by setting the time namespace offset for boottime to the
> negated current value of the CLOCK_BOOTTIME.
> 
> However, the boot time stamp provided by getboottime64() does not
> take care of time namespaces. The resulting boot time stamp 'btime'
> provided by /proc/stat does not show a plausible time stamp inside
> the time namespace of a container.
> 
> We address this by shifting the value returned by getboottime64()
> by subtracting the boottime offset of the time namespace.
> (A selftest to check the expected /proc/stat 'btime' inside the
> namespace is provided.)
> 
> Further, to avoid to show processes as time travelers inside of the
> time namespace the boottime offset then needs to be added to the
> start_boottime provided by the task_struct.
> 
> v4 Changes:
> Avoid type conversions back and forth between timespec64 and ktime_t
> in 'proc/stat.c' as suggested by Andrei.
> Introduced timens_sub_boottime() in 'time_namespace.h' to provide
> better coder readability/consistency.
> 

Reviewed-by: Andrei Vagin <avagin@gmail.com>

Thanks,
Andrei

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 0/3] time namespace aware system boot time
  2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
                   ` (3 preceding siblings ...)
  2020-10-22  8:36 ` [PATCH v4 0/3] time namespace aware system boot time Andrei Vagin
@ 2020-10-26 10:27 ` Thomas Gleixner
  4 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2020-10-26 10:27 UTC (permalink / raw)
  To: Michael Weiß, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

On Mon, Oct 19 2020 at 21:52, Michael Weiß wrote:
> Michael Weiß (3):
>   timens: additional helper functions for boottime offset handling
>   fs/proc: apply the time namespace offset to /proc/stat btime
>   selftests/timens: added selftest for /proc/stat btime
>
>  fs/proc/array.c                         |  6 ++-
>  fs/proc/stat.c                          |  3 ++
>  include/linux/time_namespace.h          | 22 ++++++++++
>  tools/testing/selftests/timens/procfs.c | 58 ++++++++++++++++++++++++-
>  4 files changed, 86 insertions(+), 3 deletions(-)

Acked-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime
  2020-10-19 19:52 ` [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime Michael Weiß
@ 2020-10-26 10:28   ` Thomas Gleixner
  2020-10-27 11:56     ` Michael Weiß
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2020-10-26 10:28 UTC (permalink / raw)
  To: Michael Weiß, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Michael Weiß

On Mon, Oct 19 2020 at 21:52, Michael Weiß wrote:

> '/proc/stat' provides the field 'btime' which states the time stamp of
> system boot in seconds. In case of time namespaces, the offset to the
> boot time stamp was not applied earlier. However, in container
> runtimes which utilize time namespaces to virtualize boottime of a
> container, this leaks information about the host system boot time.

Not sure if that qualifies as a leak. The point is that it confuses the
tasks which are in a different time universe.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime
  2020-10-26 10:28   ` Thomas Gleixner
@ 2020-10-27 11:56     ` Michael Weiß
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Weiß @ 2020-10-27 11:56 UTC (permalink / raw)
  To: Thomas Gleixner, Andrei Vagin, Christian Brauner
  Cc: Dmitry Safonov, linux-kernel, J . Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker

Thomas,

On 26.10.20 11:28, Thomas Gleixner wrote:
> On Mon, Oct 19 2020 at 21:52, Michael Weiß wrote:
> 
>> '/proc/stat' provides the field 'btime' which states the time stamp of
>> system boot in seconds. In case of time namespaces, the offset to the
>> boot time stamp was not applied earlier. However, in container
>> runtimes which utilize time namespaces to virtualize boottime of a
>> container, this leaks information about the host system boot time.
> 
> Not sure if that qualifies as a leak. The point is that it confuses the
> tasks which are in a different time universe.
> 
From a container isolation perspective this could be interpreted as an
information leak, if you want to hide the host systems boot time form
users inside of the container.
However, since proc is not totally isolated concerning host system
information inside of namespaces, I agree to your point.

I will take this into account when preparing v5.
When amending the tags I will also updated this passage and remove the
leak statement.

Thanks,
Michael

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-10-27 11:56 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-19 19:52 [PATCH v4 0/3] time namespace aware system boot time Michael Weiß
2020-10-19 19:52 ` [PATCH v4 1/3] timens: additional helper functions for boottime offset handling Michael Weiß
2020-10-19 19:52 ` [PATCH v4 2/3] fs/proc: apply the time namespace offset to /proc/stat btime Michael Weiß
2020-10-26 10:28   ` Thomas Gleixner
2020-10-27 11:56     ` Michael Weiß
2020-10-19 19:52 ` [PATCH v4 3/3] selftests/timens: added selftest for " Michael Weiß
2020-10-22  8:36 ` [PATCH v4 0/3] time namespace aware system boot time Andrei Vagin
2020-10-26 10:27 ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.