* [RFC Patch v2 00/16] COarse-grain LOck-stepping Virtual Machines for Non-stop Service
@ 2013-07-11  8:35 Wen Congyang
  2013-07-11  8:35 ` [RFC Patch v2 01/16] xen: introduce new hypercall to reset vcpu Wen Congyang
                   ` (17 more replies)
  0 siblings, 18 replies; 30+ messages in thread
From: Wen Congyang @ 2013-07-11  8:35 UTC
  To: Dong Eddie, Lai Jiangshan, xen-devl, Shriram Rajagopalan
  Cc: Jiang Yunhong, Wen Congyang, Ye Wei, Xu Yao, Hong Tao

Virtual machine (VM) replication is a well-known technique for providing
application-agnostic, software-implemented hardware fault tolerance, i.e.
"non-stop service". Currently, Remus provides this function, but it buffers
all output packets, and the resulting latency is unacceptable.

At Xen Summit 2012, we introduced a new VM replication solution: COLO
(COarse-grain LOck-stepping virtual machine). The presentation is available
at the following URL:
http://www.slideshare.net/xen_com_mgr/colo-coarsegrain-lockstepping-virtual-machines-for-nonstop-service

Here is a summary of the solution:
From the client's point of view, as long as the client observes identical
responses from the primary and secondary VMs, according to the service
semantics, the secondary VM (SVM) is a valid replica of the primary
VM (PVM) and can successfully take over when a hardware failure of the
PVM is detected.
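
As a rough illustration of the comparison described above (a sketch only,
not code from this patchset): the helper name needs_checkpoint and the
modeling of output packets as byte strings compared pairwise in order are
assumptions made for this example.

```python
from itertools import zip_longest


def needs_checkpoint(pvm_packets, svm_packets):
    """Return True if the two output streams diverge.

    Hypothetical helper: packets are modeled as bytes objects and
    compared pairwise in order. A packet missing on either side
    counts as a divergence.
    """
    for pvm_pkt, svm_pkt in zip_longest(pvm_packets, svm_packets):
        if pvm_pkt != svm_pkt:
            return True
    return False


# Identical responses: the SVM is still a valid replica.
same = needs_checkpoint([b"HTTP 200", b"body"], [b"HTTP 200", b"body"])
# Divergent responses: the SVM has to be resynchronized from the PVM.
diverged = needs_checkpoint([b"HTTP 200", b"body"], [b"HTTP 200", b"other"])
```

While the streams match, the client cannot tell the two VMs apart, so no
state transfer is needed in this model.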

This patchset is an RFC and implements the framework of COLO:
1. Both the PVM and the SVM are running
2. A checkpoint is taken only when the output packets from the PVM and SVM differ
3. Write requests from the SVM are cached
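
A minimal sketch of point 3, to make the idea concrete. The class
SvmWriteCache, the dict-based disk model, and in particular the choice to
drop cached writes at a checkpoint are assumptions for illustration; the
actual behaviour is implemented in block-remus by the patches below.

```python
class SvmWriteCache:
    """Hypothetical in-memory cache for SVM disk writes.

    The backing disk is modeled as a dict mapping sector number to data.
    """

    def __init__(self, disk):
        self.disk = disk      # sector -> bytes, backing disk state
        self.pending = {}     # SVM writes since the last checkpoint

    def write(self, sector, data):
        # Cache the write; the backing disk is left untouched.
        self.pending[sector] = data

    def read(self, sector):
        # The SVM sees its own cached writes first, then the disk.
        if sector in self.pending:
            return self.pending[sector]
        return self.disk.get(sector)

    def checkpoint(self):
        # Assumption: cached SVM writes are dropped at a checkpoint,
        # because the SVM state is replaced by the PVM's.
        self.pending.clear()


disk = {0: b"boot"}
cache = SvmWriteCache(disk)
cache.write(0, b"svm-data")
seen_before = cache.read(0)   # the SVM reads back its own write
cache.checkpoint()
seen_after = cache.read(0)    # the cached write is gone after the checkpoint
```

Keeping SVM writes out of the backing disk means the disk never has to be
rolled back when the SVM is resynchronized in this model.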

ChangeLog from v1 to v2:
1. update block-remus to support colo
2. split the large patch into smaller ones
3. fix some bugs
4. add a new hypercall for colo

Changelog:
  Patch 1: optimize the dirty pages transfer speed.
  Patch 2-3: allow the SVM to keep running after a checkpoint
  Patch 4-5: modifications for colo on the master side (wait for a new
             checkpoint, communicate with the slave when doing a checkpoint)
  Patch 6-7: implement colo's user interface


Wen Congyang (16):
  xen: introduce new hypercall to reset vcpu
  block-remus: introduce colo mode
  block-remus: introduce a interface to allow the user specify which
    mode the backup end uses
  dominfo.completeRestore() will be called more than once in colo mode
  xc_domain_restore: introduce restore_callbacks for colo
  colo: implement restore_callbacks init()/free()
  colo: implement restore_callbacks get_page()
  colo: implement restore_callbacks flush_memory
  colo: implement restore_callbacks update_p2m()
  colo: implement restore_callbacks finish_restore()
  xc_restore: implement for colo
  XendCheckpoint: implement colo
  xc_domain_save: flush cache before calling callbacks->postcopy()
  add callback to configure network for colo
  xc_domain_save: implement save_callbacks for colo
  remus: implement colo mode

 tools/blktap2/drivers/block-remus.c               |  188 ++++-
 tools/libxc/Makefile                              |    8 +-
 tools/libxc/xc_domain_restore.c                   |  264 ++++--
 tools/libxc/xc_domain_restore_colo.c              |  939 +++++++++++++++++++++
 tools/libxc/xc_domain_save.c                      |   23 +-
 tools/libxc/xc_save_restore_colo.h                |   14 +
 tools/libxc/xenguest.h                            |   51 ++
 tools/libxl/Makefile                              |    2 +-
 tools/python/xen/lowlevel/checkpoint/checkpoint.c |  322 +++++++-
 tools/python/xen/lowlevel/checkpoint/checkpoint.h |    1 +
 tools/python/xen/remus/device.py                  |    8 +
 tools/python/xen/remus/image.py                   |    8 +-
 tools/python/xen/remus/save.py                    |   13 +-
 tools/python/xen/xend/XendCheckpoint.py           |  127 ++-
 tools/python/xen/xend/XendDomainInfo.py           |   13 +-
 tools/remus/remus                                 |   28 +-
 tools/xcutils/Makefile                            |    4 +-
 tools/xcutils/xc_restore.c                        |   36 +-
 xen/arch/x86/domain.c                             |   57 ++
 xen/arch/x86/x86_64/entry.S                       |    4 +
 xen/include/public/xen.h                          |    1 +
 21 files changed, 1947 insertions(+), 164 deletions(-)
 create mode 100644 tools/libxc/xc_domain_restore_colo.c
 create mode 100644 tools/libxc/xc_save_restore_colo.h

-- 
1.7.4

Thread overview: 30+ messages
2013-07-11  8:35 [RFC Patch v2 00/16] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 01/16] xen: introduce new hypercall to reset vcpu Wen Congyang
2013-07-11  9:44   ` Andrew Cooper
2013-07-11  9:58     ` Wen Congyang
2013-07-11 10:01       ` Ian Campbell
2013-08-01 11:48   ` Tim Deegan
2013-08-06  6:47     ` Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 02/16] block-remus: introduce colo mode Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 03/16] block-remus: introduce a interface to allow the user specify which mode the backup end uses Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 04/16] dominfo.completeRestore() will be called more than once in colo mode Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 05/16] xc_domain_restore: introduce restore_callbacks for colo Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 06/16] colo: implement restore_callbacks init()/free() Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 07/16] colo: implement restore_callbacks get_page() Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 08/16] colo: implement restore_callbacks flush_memory Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 09/16] colo: implement restore_callbacks update_p2m() Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 10/16] colo: implement restore_callbacks finish_restore() Wen Congyang
2013-07-11  9:40   ` Ian Campbell
2013-07-11  9:54     ` Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 11/16] xc_restore: implement for colo Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 12/16] XendCheckpoint: implement colo Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 13/16] xc_domain_save: flush cache before calling callbacks->postcopy() Wen Congyang
2013-07-11 13:43   ` Andrew Cooper
2013-07-12  1:36     ` Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 14/16] add callback to configure network for colo Wen Congyang
2013-07-11  8:35 ` [RFC Patch v2 15/16] xc_domain_save: implement save_callbacks " Wen Congyang
2013-07-11 13:52   ` Andrew Cooper
2013-07-11  8:35 ` [RFC Patch v2 16/16] remus: implement colo mode Wen Congyang
2013-07-11  9:37 ` [RFC Patch v2 00/16] COarse-grain LOck-stepping Virtual Machines for Non-stop Service Andrew Cooper
2013-07-11  9:40 ` Ian Campbell
2013-07-14 14:33   ` Shriram Rajagopalan