[Qemu-devel] [PATCH COLO-Frame v8 00/34] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

From: zhanghailiang @ 2015-07-29  8:45 UTC (permalink / raw)
  To: qemu-devel
  Cc: lizhijian, quintela, yunhong.jiang, eddie.dong, peter.huangpeng,
	dgilbert, arei.gonglei, amit.shah, zhanghailiang

This is the 8th version of COLO.

This series contains only the COLO framework part: VM checkpoint,
failover, the proxy API and the block replication API. It does not include
block replication itself; the block part is posted as a separate series.

As usual, we provide 'basic' and 'developing' branches on GitHub:
https://github.com/coloft/qemu/commits/colo-v1.5-basic
https://github.com/coloft/qemu/commits/colo-v1.5-developing (more features)

The 'basic' branch is exactly the same as this patch series.
We will keep this series as simple as possible, to make review easier.

The extra features in the colo-v1.5-developing branch:
1) Separate the RAM and device save/load processes to reduce the amount of
extra memory used during a checkpoint.
2) Live-migrate part of the dirty pages to the slave during the sleep time.
3) Show statistics about checkpoints through the command 'info migrate'.

Please refer to the following link to test COLO:
http://wiki.qemu.org/Features/COLO

COLO is a completely new feature that is still at an early stage;
your comments and feedback are warmly welcome.

NOTE:
We have decided to re-implement the COLO proxy in userspace (in QEMU, to be exact).
You can find the discussion of why and how to implement the COLO proxy in QEMU at the following link:
http://lists.nongnu.org/archive/html/qemu-devel/2015-07/msg04069.html

TODO:
1. COLO function switch on/off
2. The capability of continuous FT
3. Optimize the performance.

v8:
- Move some global variables into MigrationIncomingState and MigrationState
- Move some cleanup work from the colo thread and colo incoming thread into the
  failover BH function, and also fix the code logic of the cleanup work.
- Fix the bug where the colo thread and colo incoming thread could block in the
  socket 'recv' call while failover is in progress.
- Optimize colo_flush_ram_cache()
- Add a migration state for the incoming side; we use this state to verify
  whether the migration incoming side is in COLO state (Patch 5).
- Drop the patch 'COLO: Disable qdev hotplug when VM is in COLO mode', since it is not correct.
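
For reviewers not familiar with the blocked 'recv' problem mentioned in the
v8 changelog, here is a minimal standalone sketch of one common way to wake a
thread that is stuck in recv(): shut the socket down from another thread.
This is only an illustration under stated assumptions, not the code in this
series. The names reader_thread() and demo_unblock_recv() are hypothetical,
and a socketpair stands in for the migration socket. On Linux the woken
recv() is expected to return 0.

```c
#define _POSIX_C_SOURCE 200809L

#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a thread that reads from the peer. */
struct reader {
    int fd;          /* socket the thread reads from */
    ssize_t result;  /* value recv() returned once it woke up */
};

static void *reader_thread(void *opaque)
{
    struct reader *r = opaque;
    char buf[64];

    /* Blocks until data arrives, the peer closes, or the socket is shut down. */
    r->result = recv(r->fd, buf, sizeof(buf), 0);
    return NULL;
}

/*
 * Start a reader on one end of a socketpair, then shut that end down from
 * the calling thread (the "failover" side in this sketch).
 * Returns the reader's recv() result, or -2 if the setup failed.
 */
ssize_t demo_unblock_recv(void)
{
    int sv[2];
    pthread_t tid;
    struct reader r;
    struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100 ms */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -2;
    }
    r.fd = sv[0];
    r.result = -2;

    if (pthread_create(&tid, NULL, reader_thread, &r) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    /* Give the reader time to enter recv(); good enough for a demo. */
    nanosleep(&delay, NULL);

    /* No data was ever sent, so only the shutdown can wake the reader. */
    shutdown(sv[0], SHUT_RDWR);

    pthread_join(tid, NULL);
    close(sv[0]);
    close(sv[1]);
    return r.result;
}
```

Calling demo_unblock_recv() from a small main() (built with -pthread) is
enough to try it. The point of the sketch is that the failover path does not
have to wait for the peer: it can force the blocked reader to return and let
it notice the failover state on its own.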

zhanghailiang (34):
  configure: Add parameter for configure to enable/disable COLO support
  migration: Introduce capability 'colo' to migration
  COLO: migrate colo related info to slave
  colo-comm/migration: skip colo info section for special cases
  migration: Add state records for migration incoming
  migration: Integrate COLO checkpoint process into migration
  migration: Integrate COLO checkpoint process into loadvm
  COLO: Implement colo checkpoint protocol
  COLO: Add a new RunState RUN_STATE_COLO
  QEMUSizedBuffer: Introduce two help functions for qsb
  COLO: Save VM state to slave when do checkpoint
  COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
  COLO VMstate: Load VM state into qsb before restore it
  arch_init: Start to trace dirty pages of SVM
  COLO RAM: Flush cached RAM into SVM's memory
  COLO failover: Introduce a new command to trigger a failover
  COLO failover: Introduce state to record failover process
  COLO failover: Implement COLO primary/secondary vm failover work
  qmp event: Add event notification for COLO error
  COLO failover: Don't do failover during loading VM's state
  COLO: Add new command parameter 'forward_nic' 'colo_script' for net
  COLO NIC: Init/remove colo nic devices when add/cleanup tap devices
  tap: Make launch_script() public
  COLO NIC: Implement colo nic device interface configure()
  colo-nic: Handle secondary VM's original net device configure
  COLO NIC: Implement colo nic init/destroy function
  COLO NIC: Some init work related with proxy module
  COLO: Handle nfnetlink message from proxy module
  COLO: Do checkpoint according to the result of packets comparation
  COLO: Improve checkpoint efficiency by do additional periodic
    checkpoint
  COLO: Add colo-set-checkpoint-period command
  COLO NIC: Implement NIC checkpoint and failover
  COLO: Implement shutdown checkpoint
  COLO: Add block replication into colo process

 configure                     |  33 +-
 docs/qmp/qmp-events.txt       |  16 +
 hmp-commands.hx               |  30 ++
 hmp.c                         |  15 +
 hmp.h                         |   2 +
 include/exec/cpu-all.h        |   1 +
 include/migration/colo.h      |  45 +++
 include/migration/failover.h  |  33 ++
 include/migration/migration.h |  19 +
 include/migration/qemu-file.h |   3 +-
 include/net/colo-nic.h        |  37 ++
 include/net/net.h             |   2 +
 include/net/tap.h             |  19 +
 include/sysemu/sysemu.h       |   3 +
 migration/Makefile.objs       |   2 +
 migration/colo-comm.c         |  75 ++++
 migration/colo-failover.c     |  83 +++++
 migration/colo.c              | 805 ++++++++++++++++++++++++++++++++++++++++++
 migration/migration.c         | 116 ++++--
 migration/qemu-file-buf.c     |  58 +++
 migration/ram.c               | 242 ++++++++++++-
 migration/savevm.c            |   2 +-
 net/Makefile.objs             |   1 +
 net/colo-nic.c                | 457 ++++++++++++++++++++++++
 net/net.c                     |   2 +
 net/tap.c                     |  90 +++--
 qapi-schema.json              |  58 ++-
 qapi/event.json               |  15 +
 qemu-options.hx               |   7 +
 qmp-commands.hx               |  42 +++
 scripts/colo-proxy-script.sh  | 145 ++++++++
 stubs/Makefile.objs           |   1 +
 stubs/migration-colo.c        |  58 +++
 trace-events                  |  10 +
 vl.c                          |  37 +-
 35 files changed, 2474 insertions(+), 90 deletions(-)
 create mode 100644 include/migration/colo.h
 create mode 100644 include/migration/failover.h
 create mode 100644 include/net/colo-nic.h
 create mode 100644 migration/colo-comm.c
 create mode 100644 migration/colo-failover.c
 create mode 100644 migration/colo.c
 create mode 100644 net/colo-nic.c
 create mode 100755 scripts/colo-proxy-script.sh
 create mode 100644 stubs/migration-colo.c

-- 
1.8.3.1
