linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexey Dobriyan <adobriyan@gmail.com>
To: Oren Laadan <orenl@cs.columbia.edu>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
	akpm@linux-foundation.org, containers@lists.linux-foundation.org,
	xemul@parallels.com, serue@us.ibm.com, mingo@elte.hu,
	hch@infradead.org, torvalds@linux-foundation.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/30] C/R OpenVZ/Virtuozzo style
Date: Tue, 14 Apr 2009 22:34:35 +0400	[thread overview]
Message-ID: <20090414183435.GA28233@x200.localdomain> (raw)
In-Reply-To: <49E4D115.5080601@cs.columbia.edu>

On Tue, Apr 14, 2009 at 02:08:21PM -0400, Oren Laadan wrote:
> 
> 
> Alexey Dobriyan wrote:
> > On Tue, Apr 14, 2009 at 12:26:50AM -0400, Oren Laadan wrote:
> >> Alexey Dobriyan wrote:
> >>> On Thu, Apr 09, 2009 at 10:07:11PM -0700, Dave Hansen wrote:
> >>>> I'm curious how you see these fitting in with the work that we've been
> >>>> doing with Oren.  Do you mean to just start a discussion or are you
> >>>> really proposing these as an alternative to what Oren has been posting?
> >>> Yes, this is posted as alternative.
> >>>
> >>> Some design decisions are seen as incorrect from here like:
> >> A definition of "design" would help; I find most of your comments
> >> below either vague, cryptic, or technical nits...
> >>
> >>> * not rejecting checkpoint with possible "leaks" from container
> >> ...like this, for example.
> > 
> > Like checkpointing one process out of many living together.
> 
> See the thread on creating tasks in userspace vs. kernel space:
> the argument here is that is an interesting enough use case for
> a checkpoint of not-an-entire-container.
> 
> Of course it will require more logic to it, so the user can choose
> what she cares or does not care about, and the kernel could alert
> the user about it.
> 
> The point is, that it is, IMHO, a desirable capability.
> 
> > 
> > If you allow this you consequently drop checks (e.g. refcount checks)
> > for "somebody else is using structure to be checkpointed".
> > 
> 
> From this point below, I totally agree with you that for the purpose
> of a whole-container-checkpoint this is certainly desirable. My point
> was that it can be easily added the existing patchset (not yours).
> Why not add it there ?
> 
> > If you drop these checks, you can't decipher legal sutiations like
> > "process genuinely doesn't care about routing table of netns it lives in"
> > from "illegal" situations like "process created shm segment but currently
> > doesn't use it so not checkpointing ipcns will result in breakagenlater".
> > 
> > You'll have to move responsibility to user, so user exactly knows what
> > app relies on and on what. And probably add flags like CKPT_SHM,
> > CKPT_NETNS_ROUTE ad infinitum.
> > 
> > And user will screw it badly and complain: "after restart my app
> > segfaulted". And user himself is screwed now: old running process is
> > already killed (it was checkpointed on purpose) and new process in image
> > segfaults every time it's restarted.
> > 
> > All of this in out opinion results in doing C/R unreliably and badly.
> > 
> > We are going to do it well and dig from the other side.
> > 
> > If "leak" (any "leak") is detected, C/R is aborted because kernel
> > doesn't know what app relies on and what app doesn't care about.
> > 
> > This protected from situations and failure modes described above.
> > 
> > This also protects to some extent from in-kernel changes where C/R code
> > should have been updated but wasn't. Person doing incomplete change won't
> > notice e.g refcount checks and won't try to "fix" them. But we'll notice it,
> > e.g. when running testsuite (amen) and update C/R code accordingly.
> > 
> > I'm talking about these checks so that everyone understands:
> > 
> > 	for_each_cr_object(ctx, obj, CR_CTX_MM_STRUCT) {
> >                 struct mm_struct *mm = obj->o_obj;
> >                 unsigned int cnt = atomic_read(&mm->mm_users);
> > 
> >                 if (obj->o_count != cnt) {
> >                         printk("%s: mm_struct %p has external references %lu:%u\n", __func__, mm, obj->o_count, cnt);
> >                         return -EINVAL;
> >                 }
> >         }
> > 
> > They are like moving detectors, small, invisible, something moved, you don't
> > know what, but you don't care because you have to investigate anyway.
> > 
> > In this scheme, if user wants to checkpoint just one process, he should
> > start it alone in separate container. Right now, in posted patchset
> > as cloned process with
> > CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWNET
> 
> So you suggest that to checkpoint a single process, say a cpu job that
> would run a week, which runs in the topmost pid_ns, I will need to
> checkpoint the entire topmost pid_ns (as a container, if at all possible
> - surely there will non-checkpointable tasks there) and then in
> user-space filter out the data and leave only one task, and then to
> restart I'll use a container again ?

No, you do little preparations and start CPU job in container from the very
beginning.

> Pffff ... why not just allow subtree checkpoint, not in a container,
> with its well known limitations -- would work the same, for very little
> additional implementation cost.

  reply	other threads:[~2009-04-14 18:36 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-04-10  2:32 [PATCH 00/30] C/R OpenVZ/Virtuozzo style Alexey Dobriyan
2009-04-10  2:44 ` Alexey Dobriyan
2009-04-10  5:07 ` Dave Hansen
2009-04-13  9:14   ` Alexey Dobriyan
2009-04-13 11:16     ` Dave Hansen
2009-04-13 18:07     ` Dave Hansen
2009-04-14  4:26     ` Oren Laadan
2009-04-14 14:58       ` Alexey Dobriyan
2009-04-14 18:08         ` Oren Laadan
2009-04-14 18:34           ` Alexey Dobriyan [this message]
2009-04-14 19:31             ` Oren Laadan
2009-04-14 20:08               ` Alexey Dobriyan
2009-04-14 20:49           ` Alexey Dobriyan
2009-04-14 21:11             ` Dave Hansen
2009-04-14 21:39             ` Serge E. Hallyn
2009-04-15 19:21               ` CAP_SYS_ADMIN on restart(2) (was: Re: [PATCH 00/30] C/R OpenVZ/Virtuozzo style) Alexey Dobriyan
2009-04-15 20:22                 ` Serge E. Hallyn
2009-04-15 20:23                 ` Dave Hansen
2009-04-15 20:39                   ` Serge E. Hallyn
2009-04-15 21:05                     ` CAP_SYS_ADMIN on restart(2) Oren Laadan
2009-04-15 21:16                       ` Serge E. Hallyn
2009-04-16 15:35                         ` Alexey Dobriyan
2009-04-16 16:29                           ` Serge E. Hallyn
2009-04-10  8:28 ` [PATCH 00/30] C/R OpenVZ/Virtuozzo style Ingo Molnar
2009-04-10 11:45   ` Alexey Dobriyan
2009-04-10 15:06 ` Linus Torvalds
2009-04-13  7:39   ` Alexey Dobriyan
2009-04-13 18:39     ` Linus Torvalds
2009-04-13 19:30       ` Ingo Molnar
2009-04-14 12:29       ` Alexey Dobriyan
2009-04-14 13:44         ` Ingo Molnar
2009-04-14 16:53           ` Alexey Dobriyan
2009-04-14 17:09         ` Linus Torvalds
2009-04-14 17:19           ` Randy Dunlap
2009-04-14 17:32             ` Linus Torvalds
2009-04-14  5:46 ` Oren Laadan
2009-04-14 15:19   ` Alexey Dobriyan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090414183435.GA28233@x200.localdomain \
    --to=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=containers@lists.linux-foundation.org \
    --cc=dave@linux.vnet.ibm.com \
    --cc=hch@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=orenl@cs.columbia.edu \
    --cc=serue@us.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xemul@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).