From: Michal Hocko <mhocko@kernel.org>
To: Alexander Graf <graf@amazon.com>
Cc: Jann Horn <jannh@google.com>, Pavel Machek <pavel@ucw.cz>,
	"Catangiu, Adrian Costin" <acatan@amazon.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"rjw@rjwysocki.net" <rjw@rjwysocki.net>,
	"len.brown@intel.com" <len.brown@intel.com>,
	"fweimer@redhat.com" <fweimer@redhat.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"luto@amacapital.net" <luto@amacapital.net>,
	"wad@chromium.org" <wad@chromium.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"bonzini@gnu.org" <bonzini@gnu.org>,
	"MacCarthaigh, Colm" <colmmacc@amazon.com>,
	"Singh, Balbir" <sblbir@amazon.com>,
	"Sandu, Andrei" <sandreim@amazon.com>,
	"Brooker, Marc" <mbrooker@amazon.com>,
	"Weiss, Radu" <raduweis@amazon.com>,
	"Manwaring, Derek" <derekmn@amazon.com>
Subject: Re: [RFC]: mm,power: introduce MADV_WIPEONSUSPEND
Date: Tue, 7 Jul 2020 11:14:51 +0200
Message-ID: <20200707091451.GB5913@dhcp22.suse.cz>
In-Reply-To: <efa55313-ce8a-bac9-15df-167f93c672b3@amazon.com>

On Tue 07-07-20 10:01:23, Alexander Graf wrote:
> On 07.07.20 09:44, Michal Hocko wrote:
> > On Mon 06-07-20 14:52:07, Jann Horn wrote:
> > > On Mon, Jul 6, 2020 at 2:27 PM Alexander Graf <graf@amazon.com> wrote:
> > > > Unless we create a vsyscall that returns both the PID as well as the
> > > > epoch and thus handles fork *and* suspend. I need to think about this a
> > > > bit more :).
> > > 
> > > You can't reliably detect forking by checking the PID if it is
> > > possible for multiple forks to be chained before the reuse check runs:
> > > 
> > >   - pid 1000 remembers its PID
> > >   - pid 1000 forks, creating child pid 1001
> > >   - pid 1000 exits and is waited on by init
> > >   - the pid allocator wraps around
> > >   - pid 1001 forks, creating child pid 1000
> > >   - child with pid 1000 tries to check for forking, determines that its
> > > PID is 1000, and concludes that it is still the original process
> > 
> > I must be really missing something here because I really fail to see why
> > there has to be something new even invented. Sure, checking for pid is
> > certainly a suboptimal solution because pids are terrible tokens to work
> > with. We do have a concept of file descriptors, which are much better and
> > supports signaling. There is a clear source of the signal IIUC
> > (migration) and there are consumers to act upon that (e.g. crypto
> > backends). So what does really prevent to use a standard signal delivery
> > over fd for this usecase?
> 
> I wasn't part of the discussions on why things like WIPEONFORK were invented
> instead of just using signalling mechanisms, but the main reason I can think
> of are libraries.

Well, I would argue that WIPEONFORK is conceptually different. It is a
one-time initialization mechanism with very clear lifetime semantics.
So the programming model is really as easy as this: the initial state is
always 0 for a new task, without any surprises later on, because you own
the memory (essentially an extension of the initialized .data section on
exec to any new task).
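
A minimal sketch of the WIPEONFORK semantics described above, assuming
Linux 4.14 or later and a libc that exposes MADV_WIPEONFORK. The helper
name child_sees_zeroes() is hypothetical and only serves the
illustration: the parent fills a private anonymous page, marks it, forks,
and the child reports whether it observes all zeroes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper. Returns 1 if the child saw zeroed memory, 0 if it
 * saw the parent's data, and -1 on any error (for example a kernel that
 * does not support MADV_WIPEONFORK).
 */
int child_sees_zeroes(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* MADV_WIPEONFORK only applies to private anonymous mappings. */
    unsigned char *secret = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (secret == MAP_FAILED)
        return -1;

    memset(secret, 0xAA, len);              /* stand-in for key material */
    if (madvise(secret, len, MADV_WIPEONFORK) != 0) {
        munmap(secret, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(secret, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: the range should read as zero-filled memory. */
        int wiped = 1;
        for (size_t i = 0; i < len; i++) {
            if (secret[i] != 0) {
                wiped = 0;
                break;
            }
        }
        _exit(wiped ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(secret, len);
        return -1;
    }

    /* The parent's copy is untouched by the fork. */
    int parent_intact = (secret[0] == 0xAA);
    munmap(secret, len);

    if (!WIFEXITED(status) || !parent_intact)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

The point of the sketch is that the child needs no synchronization at
all: the wipe happens exactly once, at task creation, before the child
executes any code of its own.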

Compare that to the completely asynchronous nature of this interface. Any
read would essentially have to be properly synchronized with the external
event; otherwise the state could have been corrupted in the meantime. Such
a consistency model is really cumbersome to work with.

> As a library, you are under no control of the main loop usually, which means
> you just don't have a way to poll for an fd. As a library author, I would
> usually try very hard to avoid creating such a dependency, because it makes
> it really hard to glue pieces together.
> 
> The same applies to signals btw, which would also be a possible way to
> propagate such events.

Just to clarify, I didn't really mean POSIX signals here. Those would be
quite clumsy indeed. But I can imagine a library registering with a
system-wide mechanism to get a notification. There are many examples of
that, including a lot of usage inside libraries: all the different *bus
interfaces.
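
A rough sketch of what an fd-based notification could look like from
inside a library that does not own the main loop. Everything here is an
assumption made for illustration: no such system-wide notification
source exists in this thread, so an eventfd stands in for it, and the
names lib_state, lib_init(), fake_post_event() and lib_get_key() are
hypothetical. The library does a zero-timeout poll() at its own entry
point instead of relying on the application's event loop.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

struct lib_state {
    int notify_fd;      /* stand-in for the real notification source */
    uint64_t key;       /* placeholder for cached secret-derived state */
    unsigned reseeds;   /* how many times the state was regenerated */
};

/* Set up the library state and the (simulated) notification fd. */
int lib_init(struct lib_state *s)
{
    s->notify_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (s->notify_fd < 0)
        return -1;
    s->key = 1;
    s->reseeds = 0;
    return 0;
}

/* Simulates the system posting a "snapshot/migration happened" event. */
int fake_post_event(struct lib_state *s)
{
    uint64_t one = 1;
    ssize_t n = write(s->notify_fd, &one, sizeof(one));
    return n == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Library entry point: check for a pending event without blocking and
 * regenerate the cached state before handing it out.
 */
uint64_t lib_get_key(struct lib_state *s)
{
    struct pollfd p = { .fd = s->notify_fd, .events = POLLIN };

    if (poll(&p, 1, 0) > 0 && (p.revents & POLLIN)) {
        uint64_t count;
        /* Reading an eventfd returns the counter and resets it to 0. */
        if (read(s->notify_fd, &count, sizeof(count)) ==
            (ssize_t)sizeof(count)) {
            s->key++;       /* placeholder for real regeneration */
            s->reseeds++;
        }
    }
    return s->key;
}
```

Note that this check-then-use pattern still leaves a window between the
poll() and the actual use of the key, which is exactly the kind of
synchronization burden discussed above.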

-- 
Michal Hocko
SUSE Labs
