From: Adrian Reber <adrian@lisas.de>
To: Christian Brauner <christian.brauner@ubuntu.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
Dmitry Safonov <dima@arista.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Dmitry Safonov <0x7f454c46@gmail.com>,
Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Cyrill Gorcunov <gorcunov@openvz.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jann Horn <jannh@google.com>, Jeff Dike <jdike@addtoit.com>,
Oleg Nesterov <oleg@redhat.com>,
Pavel Emelyanov <ovzxemul@gmail.com>,
Shuah Khan <shuah@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
containers <containers@lists.linux-foundation.org>,
criu@openvz.org, Linux API <linux-api@vger.kernel.org>,
x86@kernel.org, Andrei Vagin <avagin@gmail.com>
Subject: Re: Time Namespaces: CLONE_NEWTIME and clone3()?
Date: Tue, 18 Feb 2020 18:11:26 +0100 [thread overview]
Message-ID: <20200218171126.GF2322403@lisas.de> (raw)
In-Reply-To: <20200217230331.he6p5bs766zp6smx@wittgenstein>
On Tue, Feb 18, 2020 at 12:03:31AM +0100, Christian Brauner wrote:
> On Mon, Feb 17, 2020 at 10:47:53PM +0100, Michael Kerrisk (man-pages) wrote:
> > Hello Christian,
> >
> > On Mon, 17 Feb 2020 at 16:15, Christian Brauner
> > <christian.brauner@ubuntu.com> wrote:
> > >
> > > On Mon, Feb 17, 2020 at 03:20:55PM +0100, Michael Kerrisk wrote:
> > > > Hello Dmitry, Andrei,
> > > >
> > > > Is the CLONE_NEWTIME flag intended to be usable with clone3()? The
> > > > mail quoted below implies (in my reading) that this should be possible
> > > > once clone3() is available, which it is by now. (See also [1].)
> > > >
> > > > If the answer is yes, CLONE_NEWTIME should be usable with clone3(),
> > > > then I have a bug report and a question.
> > > >
> > > > I successfully used CLONE_NEWTIME with unshare(). But if I try to use
> > > > CLONE_NEWSIGNAL with clone3(), it errors out with EINVAL, because of
> > >
> > > s/CLONE_NEWSIGNAL/CLONE_NEWTIME/
> > >
> > > > the following check in clone3_args_valid():
> > > >
> > > > /*
> > > > * - make the CLONE_DETACHED bit reuseable for clone3
> > > > * - make the CSIGNAL bits reuseable for clone3
> > > > */
> > > > if (kargs->flags & (CLONE_DETACHED | CSIGNAL))
> > > > return false;
> > > >
> > > > The problem is that CLONE_NEWTIME matches one of the bits in the
> > > > CSIGNAL mask. If the intention is to allow CLONE_NEWTIME with
> > > > clone3(), then either the bit needs to be redefined, or the error
> > > > checking in clone3_args_valid() needs to be reworked.
> > >
> > > If this is intended to be useable with clone3() the check should be
> > > adapted to allow for CLONE_NEWTIME. (I asked about this a while ago I
> > > think.)
> > > But below rather sounds like it should simply be an unshare() flag. The
> > > code seems to set frozen_offsets to true right after copy_namespaces()
> > > in timens_on_fork(new_ns, tsk) and so the offsets can't be changed
> > > anymore unless I'm reading this wrong.
> > > Alternatives seem to either make timens_offsets writable once after fork
> > > and before exec, I guess - though that's probably not going to work
> > > with the vdso judging from timens_on_fork().
> > >
> > > The other alternative is that Andrei and Dmitry send me a patch to
> > > enable CLONE_NEWTIME with clone3() by exposing struct timens_offsets (or
> > > a version of it) in the uapi and extend struct clone_args to include a
> > > pointer to a struct timens_offset that is _only_ set when CLONE_NEWTIME
> > > is set.
> > > Though the unshare() way sounds way less invasive simpler.
> >
> > Actually, I think the alternative you propose just here is better. I
> > imagine there are times when one will want to create multiple
> > namespaces with a single call to clone3(), including a time namespace.
> > I think this should be allowed by the API. And, otherwise, clone3()
> > becomes something of a second-class citizen for creating namespaces.
> > (I don't really get the "less invasive" argument. Implementing this is
> > just a piece of kernel to code to make user-space's life a bit simpler
> > and more consistent.)
>
> I don't particularly mind either way. If there's actual users that need
> to set it at clone3() time then we can extend it. So I'd like to hear
> what Adrian, Dmitry, and Thomas think since they are well-versed how
> this will be used in the wild. I'm weary of exposing a whole new uapi
> struct and extending clone3() without any real use-case but I'm happy to
> if there is!
Re-creating a time namespace during restore via clone3() would be CRIU's
preferred way of doing this. If available CRIU is already using
clone3(), especially because of set_tid, and for the time namespace we
would also rely on clone3() to re-create it.
I will provide a patch to extend clone3() to handle the creation of a
new time namespace.
Adrian
next prev parent reply other threads:[~2020-02-18 17:20 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-12 1:26 [PATCHv8 00/34] kernel: Introduce Time Namespace Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 01/34] lib/vdso: Add unlikely() hint into vdso_read_begin() Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 02/34] lib/vdso: make do_hres and do_coarse as __always_inline Dmitry Safonov
[not found] ` <20191112012724.250792-3-dima-nzgTgzXrdUbQT0dZR+AlfA@public.gmane.org>
2020-01-10 9:45 ` Vincenzo Frascino
2020-01-10 11:42 ` Thomas Gleixner
[not found] ` <878smfa66i.fsf-ecDvlHI5BZPZikZi3RtOZ1XZhhPuCNm+@public.gmane.org>
2020-01-10 11:47 ` Vincenzo Frascino
2020-01-10 12:02 ` Thomas Gleixner
[not found] ` <875zhja59q.fsf-ecDvlHI5BZPZikZi3RtOZ1XZhhPuCNm+@public.gmane.org>
2020-01-10 12:18 ` Vincenzo Frascino
2020-01-13 5:27 ` Andrei Vagin
2019-11-12 1:26 ` [PATCHv8 03/34] ns: Introduce Time Namespace Dmitry Safonov
[not found] ` <20191112012724.250792-4-dima-nzgTgzXrdUbQT0dZR+AlfA@public.gmane.org>
2020-01-27 14:12 ` Dmitry Vyukov
[not found] ` <CACT4Y+b70bRRS2XD3yxhBoy4E-LFy_K3wMrjeuPmiEvaPe_c2Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2020-01-27 14:19 ` Dmitry Safonov
2020-02-17 14:20 ` Time Namespaces: CLONE_NEWTIME and clone3()? Michael Kerrisk
2020-02-17 14:59 ` Christian Brauner
2020-02-17 21:47 ` Michael Kerrisk (man-pages)
2020-02-17 23:03 ` Christian Brauner
2020-02-17 23:29 ` Thomas Gleixner
2020-02-18 2:37 ` Eric W. Biederman
2020-02-18 17:11 ` Adrian Reber [this message]
2020-02-18 17:26 ` Christian Brauner
2019-11-12 1:26 ` [PATCHv8 04/34] time: Add timens_offsets to be used for tasks in timens Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 05/34] posix-clocks: Rename the clock_get() callback to clock_get_timespec() Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 06/34] posix-clocks: Rename .clock_get_timespec() callbacks accordingly Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 07/34] alarmtimer: Rename gettime() callback to get_ktime() Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 08/34] alarmtimer: Provide get_timespec() callback Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 09/34] posix-clocks: Introduce clock_get_ktime() callback Dmitry Safonov
2019-11-12 1:26 ` [PATCHv8 10/34] posix-timers: Use clock_get_ktime() in common_timer_get() Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 11/34] posix-clocks: Wire up clock_gettime() with timens offsets Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 12/34] kernel: Add do_timens_ktime_to_host() helper Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 13/34] timerfd: Make timerfd_settime() time namespace aware Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 14/34] posix-timers: Make timer_settime() " Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 15/34] alarmtimer: Make nanosleep " Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 16/34] hrtimers: Prepare hrtimer_nanosleep() for time namespaces Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 17/34] posix-timers: Make clock_nanosleep() time namespace aware Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 18/34] fs/proc: Respect boottime inside time namespace for /proc/uptime Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 19/34] x86/vdso: Restrict splitting VVAR VMA Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 20/34] lib/vdso: Prepare for time namespace support Dmitry Safonov
[not found] ` <20191112012724.250792-21-dima-nzgTgzXrdUbQT0dZR+AlfA@public.gmane.org>
2020-01-12 10:32 ` Thomas Gleixner
2019-11-12 1:27 ` [PATCHv8 21/34] x86/vdso: Provide vdso_data offset on vvar_page Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 22/34] x86/vdso: Add timens page Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 23/34] time: Allocate per-timens vvar page Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 24/34] x86/vdso: Handle faults on timens page Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 25/34] x86/vdso: On timens page fault prefault also VVAR page Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 26/34] x86/vdso: Zap vvar pages on switch a time namspace Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 27/34] fs/proc: Introduce /proc/pid/timens_offsets Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 28/34] selftests/timens: Add Time Namespace test for supported clocks Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 29/34] selftests/timens: Add a test for timerfd Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 30/34] selftests/timens: Add a test for clock_nanosleep() Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 31/34] selftests/timens: Add procfs selftest Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 32/34] selftests/timens: Add timer offsets test Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 33/34] selftests/timens: Add a simple perf test for clock_gettime() Dmitry Safonov
2019-11-12 1:27 ` [PATCHv8 34/34] selftests/timens: Check for right timens offsets after fork and exec Dmitry Safonov
2019-11-21 18:05 ` [PATCHv8 00/34] kernel: Introduce Time Namespace Andrei Vagin
2019-12-11 20:38 ` Dmitry Safonov
2020-01-09 21:09 ` Thomas Gleixner
2020-01-10 9:52 ` Vincenzo Frascino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200218171126.GF2322403@lisas.de \
--to=adrian@lisas.de \
--cc=0x7f454c46@gmail.com \
--cc=arnd@arndb.de \
--cc=avagin@gmail.com \
--cc=christian.brauner@ubuntu.com \
--cc=containers@lists.linux-foundation.org \
--cc=criu@openvz.org \
--cc=dima@arista.com \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=hpa@zytor.com \
--cc=jannh@google.com \
--cc=jdike@addtoit.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=mtk.manpages@gmail.com \
--cc=oleg@redhat.com \
--cc=ovzxemul@gmail.com \
--cc=shuah@kernel.org \
--cc=tglx@linutronix.de \
--cc=vincenzo.frascino@arm.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).