From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Kyle Moffett <mrmacman_g4@mac.com>
Cc: nigel@nigel.suspend2.net,
Linus Torvalds <torvalds@linux-foundation.org>,
Pekka J Enberg <penberg@cs.helsinki.fi>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: Back to the future.
Date: Sat, 28 Apr 2007 03:15:28 +0200
Message-ID: <200704280315.29488.rjw@sisk.pl>
In-Reply-To: <35EFC5BA-D16B-41BE-A641-AEA8CCC9E0BE@mac.com>

On Saturday, 28 April 2007 03:03, Kyle Moffett wrote:
> On Apr 27, 2007, at 18:07:46, Nigel Cunningham wrote:
> > Hi.
> >
> > On Fri, 2007-04-27 at 14:44 -0700, Linus Torvalds wrote:
> >> It makes it harder to debug (wouldn't it be *nice* to just ssh in,
> >> and do
> >> gdb -p <snapshotter>
> >
> > Make the machine being suspended a VM and you can already do that.
>
> >> when something goes wrong?) but we also *depend* on user space for
> >> various things (the same way we depend on kernel threads, and why
> >> it has been such a total disaster to try to freeze the kernel
> >> threads too!). For example, if you want to do graphical stuff,
> >> just using X would be quite nice, wouldn't it?
> >
> > But in doing so you make the contents of the disk inconsistent with
> > the state you've just snapshotted, leading to filesystem
> > corruption. Even if you modify filesystems to do checkpointing
> > (which is what we're really talking about), you still also have the
> > problem that your snapshot has to be stored somewhere before you
> > write it to disk, so you also have to either [snip]
>
> Actually, it's a lot simpler than that. We can just combine the
> device-mapper snapshot with a VM+kernel snapshot system call and be
> almost done:
>
> sys_snapshot(dev_t snapblockdev, int __user *snapshotfd);
>
> When sys_snapshot is run, the kernel does:
>
> 1) Sequentially freeze mounted filesystems using blockdev freezing.
> If it's an fs that doesn't support freezing then either fail or force-
> remount-ro that fs and downgrade all its filedescriptors to RO.
> Doesn't need extra locking since process which try to do IO either
> succeed before the freeze call returns for that blockdev or sleep on
> the unfreeze of that blockdev. Filesystems are synchronized and made
> clean.
> 2) Iterate over the userspace process list, freezing each process
> and remapping all of its pages copy-on-write. Any device-specific
> pages need to have state saved by that device.

Why do you want to do 2) after 1) and not vice versa?

> 3) All processes (except kernel threads) are now frozen.
> 4) Kernel should save internal state corresponding to current
> userspace state. The kernel also swaps out excess pages to free up
> enough RAM and prepares the snapshot file-descriptor with copies of
> kernel memory and the original (pre-COW) mapped userspace pages.
> 5) Kernel substitutes filesystems for either a device-mapper
> snapshot with snapblockdev as backing storage or union with tmpfs and
> remounts the underlying filesystems as read-only.
> 6) Kernel unfreezes all userspace processes and returns the snapshot
> FD to userspace (where it can be read from).

Okay, but how do we do the error recovery if, for example, the image cannot
be saved?

> Then userspace can do whatever it wants. Any changes to filesystems
> mounted at the time of snapshot will be discarded at shutdown.
> Freshly mounted filesystems won't have the union or COW thing done,
> and so you can write your snapshot to a compressed encrypted file on
> a USB key if you want to, you just have to unmount it before the
> snapshot() syscall and remount it right afterwards.

This seems to be a good idea.

Greetings,
Rafael
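
The quoted proposal does not say how a failure part-way through steps 1) to 6) would be undone. One conventional shape for that, shown here only as a user-space sketch, is to unwind completed steps in reverse order with goto labels. Everything below is hypothetical: snapshot_sketch(), run_step(), struct snap_state and the STEP_* names are invented for illustration and are not part of any kernel API. Lifting the filesystem freeze on success is also an assumption, since the proposal does not spell it out. The sketch does not cover the case raised above where userspace fails to save the image after the call has already returned.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical step numbers, following the numbering in the quoted proposal. */
enum snap_step {
    STEP_NONE = 0,
    STEP_FREEZE_FS = 1,     /* 1) freeze mounted filesystems */
    STEP_FREEZE_TASKS = 2,  /* 2)-3) freeze userspace, remap pages COW */
    STEP_SAVE_STATE = 4,    /* 4) save kernel state, prepare the image */
    STEP_SUBSTITUTE_FS = 5  /* 5) switch to snapshot/union, remount RO */
};

/* Flags recording which steps are currently in effect. */
struct snap_state {
    int fs_frozen;
    int tasks_frozen;
    int image_ready;
    int fs_substituted;
};

/* Stub standing in for the real work: fails only at the requested step. */
static int run_step(enum snap_step step, enum snap_step fail_at)
{
    return step == fail_at ? -1 : 0;
}

/* Runs the steps in order; on failure, undoes completed steps in reverse. */
int snapshot_sketch(struct snap_state *st, enum snap_step fail_at)
{
    int err;

    memset(st, 0, sizeof(*st));

    err = run_step(STEP_FREEZE_FS, fail_at);
    if (err)
        goto out;
    st->fs_frozen = 1;

    err = run_step(STEP_FREEZE_TASKS, fail_at);
    if (err)
        goto thaw_fs;
    st->tasks_frozen = 1;

    err = run_step(STEP_SAVE_STATE, fail_at);
    if (err)
        goto thaw_tasks;
    st->image_ready = 1;

    err = run_step(STEP_SUBSTITUTE_FS, fail_at);
    if (err)
        goto drop_image;
    st->fs_substituted = 1;

    /* 6) thaw userspace and hand the image back to the caller. */
    st->tasks_frozen = 0;
    /* Assumption: the freeze is lifted once the snapshot layer is in place. */
    st->fs_frozen = 0;
    return 0;

drop_image:
    st->image_ready = 0;
thaw_tasks:
    st->tasks_frozen = 0;
thaw_fs:
    st->fs_frozen = 0;
out:
    /* No failure path reaches here with the filesystems substituted. */
    assert(!st->fs_substituted);
    return err;
}
```

Calling snapshot_sketch(&st, STEP_NONE) walks the success path, while passing one of the other STEP_* values forces a failure at that step, so each unwind path can be checked by inspecting the flags in st afterwards.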