From: Matt Helsley <matthltc@us.ibm.com>
To: Oren Laadan <orenl@cs.columbia.edu>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
    containers@lists.linux-foundation.org, akpm@linux-foundation.org,
    xemul@parallels.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 18/38] C/R: core stuff
Date: Thu, 28 May 2009 15:33:34 -0700
Message-ID: <20090528223334.GB17402@us.ibm.com>
In-Reply-To: <4A1F0E29.2040506@cs.columbia.edu>

On Thu, May 28, 2009 at 06:20:25PM -0400, Oren Laadan wrote:
>
>
> Alexey Dobriyan wrote:
> > On Wed, May 27, 2009 at 06:45:04PM -0400, Oren Laadan wrote:
> >> Alexey Dobriyan wrote:
> >>> On Wed, May 27, 2009 at 04:56:27PM -0400, Oren Laadan wrote:
> >>>> Alexey Dobriyan wrote:
> >>>>> On Tue, May 26, 2009 at 08:16:44AM -0500, Serge E. Hallyn wrote:
> >>>>>> Quoting Alexey Dobriyan (adobriyan@gmail.com):
> >>>>>>> Introduction
> >>>>>>> ------------
> >>>>>>> Checkpoint/restart (C/R from now on) allows dumping a group of processes
> >>>>>>> to disk for various reasons, like saving process state in case of box
> >>>>>>> failure or restoring a group of processes on another or the same machine later.
> >>>>>>>
> >>>>>>> Unlike, say, hypervisor-style C/R, which only needs to freeze the guest kernel
> >>>>>>> and dump more or less raw pages, the proposed C/R doesn't require a hypervisor.
> >>>>>>> For that, the C/R code needs to know about all the little and big intimate kernel details.
> >>>>>>>
> >>>>>>> The good thing is that not all details need to be serialized and saved;
> >>>>>>> readahead state, say, does not. The bad thing is that quite a few things
> >>>>>>> still need to be.
> >>>>>> Hi Alexey,
> >>>>>>
> >>>>>> the last time you posted this, I went through and tried to discern the
> >>>>>> meaningful differences between yours and Oren's patchsets. Then I sent some
> >>>>>> patches to Oren to make his set configurable to act more like yours. And Oren
> >>>>>> took them! But now you resend this patchset with no real changelog, no
> >>>>>> acknowledgment that Oren's set even exists
> >>>>> Is this a requirement? Everybody following topic already knows about
> >>>>> Oren's patchset.
> >>>> Some people do ack other people's work. See for example patches #1
> >>>> and #24 in my recent post. You're welcome.
> >>>>
> >>>>>> - or is much farther along and pretty widely reviewed and tested (which is
> >>>>>> only because he started earlier and, when we asked for your counterpatches
> >>>>>> at an earlier stage, you would never reply) - or, most importantly, what
> >>>>>> it is that you think your patchset does that his does not and cannot.
> >>>>> There are differences. And they're not small like you're trying to describe
> >>>>> but pretty big compared to the scale of the problem.
> >>>> I've asked before, and I repeat now: can you enumerate these "big"
> >>>> scary differences that make it such a "big" problem ?
> >>>>
> >>>> So far, we identified two main "design" issues -
> >>> Why in quotes? Yes, they are high-level design issues.
> >>>
> >> In quotes, because I argued further on that, although my patchset
> >> takes a stand on both issues, it can be easily reverted _within_
> >> that patchset. Moreover, I argue that they can co-exist.
> >>
> >>>> 1) Whether or not to allow c/r of a sub-container (partial hierarchy)
> >>>>
> >>>> 2) Creation of restarting process hierarchy in kernel or in userspace
> >>>>
> >>>> As for #1, you are the _only_ one who advocates restricting c/r to
> >>>> a full container only. I guess you have your reasons, but I'm unsure
> >>>> what they may be.
> >>> The reason is that checkpointing a half-frozen, half-live container is
> >>> essentially equivalent to checkpointing a live container, which adds much
> >>> complexity to the code and fundamentally prevents the kernel from taking
> >>> a coherent snapshot.
> >>>
> >>> In such situations kernel will do its job badly.
> >> In such a situation the kernel will do a bad job if the user is asking
> >> for a bad job.
> >
> > User doesn't even understand why we're discussing this issue so hard.
> >
> >> Just like checkpointing without snapshotting the file system and expecting
> >> it to always work.
> >
> > This is different.
> >
> > Kernel can't do anything about not-synced fs. Because nobody is
> > advocating that kernel should sync fs. Consequently, screwup in fs sync is
> > clearly user failure. Any (yours, mine) in-kernel C/R has this failure mode,
> > so we skip it and discuss what's left.
> >
> > Now, kernel CAN do something about tasks and other data structures
> > because it easily controls them.
> >
> > Your procedure for checkpointing starts with "kill -STOP".
>
> Wrong. It requires the processes to be frozen.
>
> > To make anything reliable, you have to ban "kill -CONT" for the duration of
> > checkpointing. Is this done BTW? I don't remember new flags added
> > in task_struct. Or this is going to be skipped on grounds that it's
> > user screwup (potentially oopsable).
> >
> > That's why OpenVZ relies solely on the suspend-to-RAM freezer: userspace
> > can't arbitrarily send suspend and freeze notifications. We only need to
> > protect against an untimely STR unfreeze, which only adds code in the C/R
> > code, not in task_struct.
>
> Same principle for both patchsets: tasks may *not* be permitted to
> execute while being checkpointed.
>
> For this I suggested a CHECKPOINTING freezer state: transition to/from
> this state is done _only_ by sys_checkpoint(), so that checkpointed
> processes cannot be unfrozen. Matt Helsley already posted a patch to
> implement this.

In case it helps, here's the patch and some feedback Oren gave me:
https://lists.linux-foundation.org/pipermail/containers/2009-May/017586.html
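
For anyone skimming the thread, the rule Oren describes above can be
sketched roughly as follows. This is only an illustration in plain
userspace C, not the code from the patch: the enum names, the caller
type, and frz_transition_ok() are all made up here, and I am assuming
that CHECKPOINTING can only be entered from (and left to) FROZEN.

```c
#include <stdbool.h>

/* Hypothetical freezer states; CHECKPOINTING is the extra state
 * discussed above. These names are illustrative, not the kernel's. */
enum frz_state {
	FRZ_THAWED,
	FRZ_FREEZING,
	FRZ_FROZEN,
	FRZ_CHECKPOINTING
};

/* Who is requesting the transition (hypothetical, for illustration). */
enum frz_caller {
	CALLER_USER,           /* e.g. a write to the freezer state file */
	CALLER_SYS_CHECKPOINT  /* the checkpoint syscall itself */
};

/* Return true if 'who' may move a task group from 'from' to 'to'. */
bool frz_transition_ok(enum frz_state from, enum frz_state to,
		       enum frz_caller who)
{
	/* Entering or leaving CHECKPOINTING is reserved for sys_checkpoint(). */
	if (from == FRZ_CHECKPOINTING || to == FRZ_CHECKPOINTING) {
		if (who != CALLER_SYS_CHECKPOINT)
			return false;
		/* Assumption: only FROZEN <-> CHECKPOINTING is allowed. */
		return (from == FRZ_FROZEN && to == FRZ_CHECKPOINTING) ||
		       (from == FRZ_CHECKPOINTING && to == FRZ_FROZEN);
	}

	/* All other transitions follow the ordinary freezer rules,
	 * simplified here to "always allowed". */
	return true;
}
```

With a check like this, a userspace thaw request that arrives while the
group is in CHECKPOINTING is simply refused, so the tasks stay frozen
until sys_checkpoint() moves the group back to FROZEN.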
Cheers,
-Matt Helsley