Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch

From: Tejun Heo @ 2010-11-02 21:35 UTC
  To: Oren Laadan; +Cc: ksummit-2010-discuss, linux-kernel

(cc'ing lkml too)
Hello,

On 11/02/2010 08:30 PM, Oren Laadan wrote:
> Following the discussion yesterday, here is a linux-cr diff that
> is limited to changes to existing code.
> 
> The diff doesn't include the eclone() patches. I also tried to strip
> off the new c/r code (either code in new files, or new code within
> #ifdef CONFIG_CHECKPOINT in existing files).
> 
> I left a few such snippets in, e.g. c/r syscalls templates and 
> declaration of c/r specific methods in, e.g. file_operations.
> 
> The remaining changes in this patch include a new freezer state
> ("CHECKPOINTING"), mostly refactoring of existing code, and a few
> new helpers.
> 
> Disclaimer: don't try to compile (or apply) - this is only intended
> to give a ballpark of how the c/r patches change existing code.

The patch size itself isn't too big, but I still think it's one scary
patch, mostly because of the breadth of code that checkpointing needs
to modify.  I suspect that is probably the biggest concern regarding
checkpoint-restart from an implementation point of view.

FWIW, I'm not quite convinced checkpoint-restart can be generally
useful.  In controlled environments where the target application's
behavior can be relatively well defined and contained (including
actions necessary to roll back in case something goes bonkers), it
would work and can be quite useful, but I'm afraid the states which
need to be saved and restored aren't defined well enough to be
generally applicable.  Not only is it a difficult problem, it actually
is impossible to define a common set of states to be saved and
restored - it depends on each application.

As such, I have a difficult time believing it can be something
generally useful.  IOW, I think talking about its usage in complex
environments like common desktops is mostly handwaving.  What about X
sessions, network connections, states established in other
applications via dbus or whatnot?  Which files need to be snapshotted
together?  What about shared mmaps?  These questions are not merely
difficult to answer in a generic way; they are impossible.

There is a very distinct difference between system-wide
suspend/hibernation and process checkpointing.  Most programs are
already written with the conditions in mind which can be caused by
system level suspend/hibernation.  Most programs don't expect to be
scheduled and run within any definite amount of time.  There usually
are provisions for loss or failure of resources which are outside the
local system.  There are corner cases which are affected, and those
programs contain code to respond to suspend/hibernation.  Please note
that this is about userland application behavior, not about
implementation details in the kernel.  It is a much more fundamental
property.

So, although checkpoint-restart can be very useful in certain
circumstances, I don't believe there can be a general implementation.
It inevitably needs to put somewhat strict restrictions on what the
applications being checkpointed are allowed to do.  And once my
train of thought reaches that point, I fail to see what the advantages
of an in-kernel implementation would be compared to something like the
following.

  http://dmtcp.sourceforge.net/

Sure, an in-kernel implementation would be able to fake it better, but
I don't think it's anything major.  The coverage would be slightly
better but breaking the illusion wouldn't take much.  Just push it a
bit further and it will break all the same.  In addition, to be
useful, it would need a userland framework or a set of workarounds
which are aware of and can manipulate userland states anyway.  For
workloads for which checkpointing would be most beneficial (HPC for
example), I think something like the above would do just fine and it
would make much more sense to add small features to make userland
checkpointing work better than doing the whole thing in the kernel.

I think in-kernel checkpointing is in an awkward place in terms of the
tradeoff between its benefits and the added complexity of implementing
it.  If you give up coverage slightly, userland checkpointing is
there.  If you need reliable coverage, proper virtualization isn't too
far away.  As such, FWIW, I fail to see enough justification for the
added complexity.  I'll be happy to be proven wrong tho.  :-)

Thank you.

-- 
tejun
