Hi Lukas,
Thanks for your reply.

However, when we tested question 1 using the steps listed below the error message, we noticed
that the secondary VM's image becomes corrupted when the VM reboots.
Here is the error message:
-------------------------------------------------------------------
[    1.280299] XFS (sda1): Mounting V5 Filesystem
[    1.428418] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2
[    1.501320] XFS (sda1): Starting recovery (logdev: internal)
[    1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
[    1.505534] Switched to clocksource tsc
[    2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xfc/0x130 [xfs]
[    2.032743] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
[    2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[    2.035882] Call Trace:
[    2.036494]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
[    2.037315]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
[    2.038150]  [<ffffffffc0138e6c>] ? xfs_free_extent+0xfc/0x130 [xfs]
[    2.039046]  [<ffffffffc01362da>] xfs_free_ag_extent+0x20a/0x780 [xfs]
[    2.039920]  [<ffffffffc0138e6c>] xfs_free_extent+0xfc/0x130 [xfs]
[    2.040768]  [<ffffffffc01a7736>] xfs_trans_free_extent+0x26/0x60 [xfs]
[    2.041642]  [<ffffffffc019fade>] xlog_recover_process_efi+0x17e/0x1c0 [xfs]
[    2.042558]  [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
[    2.043771]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
[    2.044650]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
[    2.045518]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
[    2.046341]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
[    2.047260]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
[    2.048116]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
[    2.048881]  [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
[    2.050105]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
[    2.050906]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
[    2.051963]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
[    2.059431]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
[    2.060283]  [<ffffffff81226483>] do_mount+0x233/0xaf0
[    2.061081]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
[    2.061844]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
[    2.062619]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[    2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c.  Caller xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.065260] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
[    2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[    2.068023] Call Trace:
[    2.068590]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
[    2.069403]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
[    2.070318]  [<ffffffffc019faee>] ? xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.071538]  [<ffffffffc019594d>] xfs_trans_cancel+0xbd/0xe0 [xfs]
[    2.072429]  [<ffffffffc019faee>] xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.073339]  [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
[    2.074561]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
[    2.075421]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
[    2.076301]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
[    2.077128]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
[    2.078049]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
[    2.078900]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
[    2.079667]  [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
[    2.080883]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
[    2.081687]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
[    2.082457]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
[    2.083258]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
[    2.084057]  [<ffffffff81226483>] do_mount+0x233/0xaf0
[    2.084797]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
[    2.085568]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
[    2.086324]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[    2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffc0195966
[    2.088795] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[    2.090273] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[    2.091519] XFS (sda1): Failed to recover EFIs
[    2.092299] XFS (sda1): log mount finish failed
[FAILED] Failed to mount /sysroot.
.
.
.
Generating "/run/initramfs/rdsosreport.txt"
[    2.178103] blk_update_request: I/O error, dev fd0, sector 0
[    2.246106] blk_update_request: I/O error, dev fd0, sector 0
  -------------------------------------------------------------------  
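
To check repeated reproduction runs for this failure, the signature in the log above can be detected automatically. The following is only a minimal sketch: the helper names (`xfs_messages`, `recovery_failed`) are hypothetical, and it assumes the console output has been captured as plain text in the same format as shown above.

```python
import re

# Matches kernel lines such as:
# "[    2.092299] XFS (sda1): log mount finish failed"
XFS_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+XFS \((?P<dev>[^)]+)\):\s+(?P<msg>.*)$"
)

def xfs_messages(console_log):
    """Return (timestamp, device, message) for every XFS line in the log."""
    found = []
    for line in console_log.splitlines():
        m = XFS_LINE.match(line.strip())
        if m:
            found.append((float(m.group("ts")), m.group("dev"), m.group("msg")))
    return found

def recovery_failed(console_log):
    """True if the log shows an XFS internal error and a failed log mount."""
    msgs = [msg for _, _, msg in xfs_messages(console_log)]
    has_internal_error = any(msg.startswith("Internal error") for msg in msgs)
    has_mount_failure = any("log mount finish failed" in msg for msg in msgs)
    return has_internal_error and has_mount_failure

# Three lines copied from the log above, used as sample input.
sample = """\
[    1.501320] XFS (sda1): Starting recovery (logdev: internal)
[    2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xfc/0x130 [xfs]
[    2.092299] XFS (sda1): log mount finish failed
"""
print(recovery_failed(sample))
```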

Here are the steps to reproduce:
1. Start the primary VM (PVM) with the command below, then do anything you like on the PVM
        qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio -vnc :5 \
  -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
  -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
  -device rtl8139,id=e0,netdev=hn0 \
  -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2
2. Add the devices and objects to the PVM with the following QMP commands
      {'execute':'qmp_capabilities'}
      {"execute":"chardev-add", "arguments":{ "id" : "mirror0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9003" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare1", "backend" : { "type" : "socket", "data" : { "server": true, "wait": true, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9004" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare0-0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare_out", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare_out0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-mirror", "id" : "m0", "props": { "netdev": "hn0", "outdev" : "mirror0", "queue" : "tx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire0", "props": { "netdev": "hn0", "indev" : "compare_out", "queue" : "rx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire1", "props": { "netdev": "hn0", "outdev" : "compare0", "queue" : "rx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "iothread", "id" : "iothread1", "props": {} } }
      {"execute":"object-add", "arguments":{ "qom-type" : "colo-compare", "id" : "comp0", "props": { "primary_in" : "compare0-0", "secondary_in" : "compare1", "outdev" : "compare_out0", "iothread" : "iothread1"} } }
3. Start the secondary VM (SVM) with the command below
        qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
  -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
  -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
  -device rtl8139,id=e0,netdev=hn0 \
  -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
  -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
  -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
  -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
  -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
  -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,\
node-name=node1 \
  -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=active-disk0,file.file.filename=active-disk.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
file.backing.backing=colo-disk0,node-name=node2 \
  -incoming tcp:0:9998
4. As described in the document, create the rbd server and start the migration with QMP commands
[screenshot: image.png]
5. Kill the PVM and fail over to the SVM
[screenshot: image.png]
6. Reboot the secondary VM; the error above then appears.
The error occurs with high probability.
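
The six chardev-add commands in step 2 differ only in their id, port, and server/wait flags, so they could be generated instead of typed by hand. This is just a sketch under that assumption: `chardev_add` and `CHARDEVS` are hypothetical names, and the values are copied from step 2.

```python
import json

def chardev_add(chardev_id, port, server, wait=None, host="127.0.0.1"):
    """Build one QMP chardev-add command for an inet socket chardev."""
    data = {"server": server}
    if wait is not None:
        # Step 2 only sets "wait" on the server-side chardevs.
        data["wait"] = wait
    # The port is passed as a string, as in step 2.
    data["addr"] = {"type": "inet", "data": {"host": host, "port": str(port)}}
    return {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {"type": "socket", "data": data},
        },
    }

# (id, port, server, wait) as listed in step 2.
CHARDEVS = [
    ("mirror0", 9003, True, False),
    ("compare1", 9004, True, True),
    ("compare0", 9001, True, False),
    ("compare0-0", 9001, False, None),
    ("compare_out", 9005, True, False),
    ("compare_out0", 9005, False, None),
]

for args in CHARDEVS:
    # One JSON object per line, ready to paste into the QMP monitor.
    print(json.dumps(chardev_add(*args)))
```

The printed lines can be pasted into the `-qmp stdio` monitor after the `qmp_capabilities` command.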

We can work around the image problem by running xfs_repair; after that, the VM reboots successfully.
Command:
xfs_repair -L /dev/sda1
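
Note that `-L` zeroes the dirty log, so metadata updates still held in the log are discarded. A more cautious sequence is to run `xfs_repair -n` (no-modify mode, report only) before forcing the repair. The sketch below only prints that command sequence; `repair_plan` is a hypothetical helper, and the commands are assumed to be run from a rescue shell with the filesystem unmounted.

```python
def repair_plan(device):
    """Ordered xfs_repair invocations: inspect first, then force log zeroing."""
    return [
        ["xfs_repair", "-n", device],  # no-modify mode: only report problems
        ["xfs_repair", "-L", device],  # zero the dirty log, then repair
    ]

for argv in repair_plan("/dev/sda1"):
    print(" ".join(argv))
```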

Do you have any idea what causes this problem?

Best regards,
Daniel Cho.

On Tue, Nov 5, 2019 at 2:37 AM, Lukas Straub <lukasstraub2@web.de> wrote:
On Thu, 31 Oct 2019 17:05:20 +0800
Daniel Cho <danielcho@qnap.com> wrote:

> Hello all,
> I have some questions about COLO.
> 1)  Can we enable the fault tolerance feature dynamically on a running VM?
> In your document, the primary VM cannot be started first on its own (if
> you start the primary VM, the secondary VM needs to be started too), which
> means that if I want a VM to have the fault-tolerance feature, it has to
> be set up when we boot the VM.

Hi Daniel,
Yes, this is possible as long as you have a quorum block node. The rest
can be added while running.

> 2)  If the primary VM or the secondary VM breaks, can we start a third VM
> to keep the fault tolerance feature?

I'm currently working on this, see my latest patch series here:
https://lore.kernel.org/qemu-devel/cover.1571925699.git.lukasstraub2@web.de/

>
> Best regards,
> Daniel Cho.