qemu-devel.nongnu.org archive mirror
* The problems about COLO
@ 2019-10-31  9:05 Daniel Cho
  2019-11-04 18:37 ` Lukas Straub
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Cho @ 2019-10-31  9:05 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 460 bytes --]

Hello all,
I have some questions about COLO.
1)  Can we enable the fault-tolerance feature dynamically on a running VM?
According to your document, the primary VM cannot be started on its own
first (if you start the primary VM, the secondary VM has to be started
too), so it seems that if I want a VM to have the fault-tolerance
feature, it has to be configured when we boot it.

2)  If the primary VM or the secondary VM breaks, can we start a third VM
to keep the fault-tolerance feature?


Best regards,
Daniel Cho.


* Re: The problems about COLO
  2019-10-31  9:05 The problems about COLO Daniel Cho
@ 2019-11-04 18:37 ` Lukas Straub
  2019-11-07  8:14   ` Daniel Cho
  0 siblings, 1 reply; 5+ messages in thread
From: Lukas Straub @ 2019-11-04 18:37 UTC (permalink / raw)
  To: Daniel Cho; +Cc: qemu-devel

On Thu, 31 Oct 2019 17:05:20 +0800
Daniel Cho <danielcho@qnap.com> wrote:

> Hello all,
> I have some questions about COLO.
> 1)  Can we enable the fault-tolerance feature dynamically on a running VM?
> According to your document, the primary VM cannot be started on its own
> first (if you start the primary VM, the secondary VM has to be started
> too), so it seems that if I want a VM to have the fault-tolerance
> feature, it has to be configured when we boot it.

Hi Daniel,
Yes, this is possible as long as you have a quorum block node. The rest
can be added while running.
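
To illustrate what "added while running" could look like on the primary side, here is a minimal sketch that only builds the QMP commands as JSON. It follows the pattern of the QEMU COLO document as far as I recall it, and it assumes the secondary's disk already matches the primary's. The function name, the node name replication0, the export name, and the host and port values are placeholders for illustration, not values from this thread.

```python
import json


def colo_primary_commands(secondary_host, nbd_port, migrate_port,
                          quorum_node="colo-disk0", export="colo-disk0"):
    """Build QMP commands that attach replication to a running primary.

    All arguments are placeholders chosen for this sketch.
    """
    # HMP drive_add creates the primary-mode replication node on top of
    # the NBD export offered by the secondary.
    drive_add = (
        "drive_add -n buddy driver=replication,mode=primary,"
        f"file.driver=nbd,file.host={secondary_host},file.port={nbd_port},"
        f"file.export={export},node-name=replication0"
    )
    return [
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        # Attach the new node as an extra child of the existing quorum node.
        # This is why the quorum node has to exist from the start.
        {"execute": "x-blockdev-change",
         "arguments": {"parent": quorum_node, "node": "replication0"}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-colo", "state": True}]}},
        {"execute": "migrate",
         "arguments": {"uri": f"tcp:{secondary_host}:{migrate_port}"}},
    ]


if __name__ == "__main__":
    # 192.0.2.2 is a documentation address, used here only as an example.
    for cmd in colo_primary_commands("192.0.2.2", 9999, 9998):
        print(json.dumps(cmd))
```

Each printed line can be pasted into a `-qmp stdio` session in the order shown.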

> 2)  If the primary VM or the secondary VM breaks, can we start a third VM
> to keep the fault-tolerance feature?

I'm currently working on this, see my latest patch series here:
https://lore.kernel.org/qemu-devel/cover.1571925699.git.lukasstraub2@web.de/

>
> Best regards,
> Daniel Cho.




* Re: The problems about COLO
  2019-11-04 18:37 ` Lukas Straub
@ 2019-11-07  8:14   ` Daniel Cho
  2019-11-07 13:34     ` Lukas Straub
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Cho @ 2019-11-07  8:14 UTC (permalink / raw)
  To: Lukas Straub; +Cc: qemu-devel


[-- Attachment #1.1: Type: text/plain, Size: 10501 bytes --]

Hi Lukas,
Thanks for your reply.

However, when we tested question 1 using the steps listed below the error
message, we noticed that the secondary VM's image breaks when it reboots.
Here is the error message.
-------------------------------------------------------------------
[    1.280299] XFS (sda1): Mounting V5 Filesystem
[    1.428418] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2
[    1.501320] XFS (sda1): Starting recovery (logdev: internal)
[    1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
[    1.505534] Switched to clocksource tsc
[    2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xfc/0x130 [xfs]
[    2.032743] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
[    2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[    2.035882] Call Trace:
[    2.036494]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
[    2.037315]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
[    2.038150]  [<ffffffffc0138e6c>] ? xfs_free_extent+0xfc/0x130 [xfs]
[    2.039046]  [<ffffffffc01362da>] xfs_free_ag_extent+0x20a/0x780 [xfs]
[    2.039920]  [<ffffffffc0138e6c>] xfs_free_extent+0xfc/0x130 [xfs]
[    2.040768]  [<ffffffffc01a7736>] xfs_trans_free_extent+0x26/0x60 [xfs]
[    2.041642]  [<ffffffffc019fade>] xlog_recover_process_efi+0x17e/0x1c0 [xfs]
[    2.042558]  [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
[    2.043771]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
[    2.044650]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
[    2.045518]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
[    2.046341]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
[    2.047260]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
[    2.048116]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
[    2.048881]  [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
[    2.050105]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
[    2.050906]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
[    2.051963]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
[    2.059431]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
[    2.060283]  [<ffffffff81226483>] do_mount+0x233/0xaf0
[    2.061081]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
[    2.061844]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
[    2.062619]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[    2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c.  Caller xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.065260] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
[    2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[    2.068023] Call Trace:
[    2.068590]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
[    2.069403]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
[    2.070318]  [<ffffffffc019faee>] ? xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.071538]  [<ffffffffc019594d>] xfs_trans_cancel+0xbd/0xe0 [xfs]
[    2.072429]  [<ffffffffc019faee>] xlog_recover_process_efi+0x18e/0x1c0 [xfs]
[    2.073339]  [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
[    2.074561]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
[    2.075421]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
[    2.076301]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
[    2.077128]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
[    2.078049]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
[    2.078900]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
[    2.079667]  [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
[    2.080883]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
[    2.081687]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
[    2.082457]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
[    2.083258]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
[    2.084057]  [<ffffffff81226483>] do_mount+0x233/0xaf0
[    2.084797]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
[    2.085568]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
[    2.086324]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[    2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffc0195966
[    2.088795] XFS (sda1): Corruption of in-memory data detected.  Shutting down filesystem
[    2.090273] XFS (sda1): Please umount the filesystem and rectify the problem(s)
[    2.091519] XFS (sda1): Failed to recover EFIs
[    2.092299] XFS (sda1): log mount finish failed
[FAILED] Failed to mount /sysroot.
.
.
.
Generating "/run/initramfs/rdsosreport.txt"
[    2.178103] blk_update_request: I/O error, dev fd0, sector 0
[    2.246106] blk_update_request: I/O error, dev fd0, sector 0
-------------------------------------------------------------------

Here are the steps to reproduce:
*1. Start the primary VM (PVM) with the following command, and do anything you want on the PVM*
  qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio -vnc :5 \
    -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
    -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
    -device rtl8139,id=e0,netdev=hn0 \
    -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2
*2. Add the devices and objects to the PVM with QMP commands*
      {'execute':'qmp_capabilities'}
      {"execute":"chardev-add", "arguments":{ "id" : "mirror0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9003" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare1", "backend" : { "type" : "socket", "data" : { "server": true, "wait": true, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9004" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare0-0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare_out", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } }}
      {"execute":"chardev-add", "arguments":{ "id" : "compare_out0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-mirror", "id" : "m0", "props": { "netdev": "hn0", "outdev" : "mirror0", "queue" : "tx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire0", "props": { "netdev": "hn0", "indev" : "compare_out", "queue" : "rx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire1", "props": { "netdev": "hn0", "outdev" : "compare0", "queue" : "rx" } } }
      {"execute":"object-add", "arguments":{ "qom-type" : "iothread", "id" : "iothread1", "props": {} } }
      {"execute":"object-add", "arguments":{ "qom-type" : "colo-compare", "id" : "comp0", "props": { "primary_in" : "compare0-0", "secondary_in" : "compare1", "outdev" : "compare_out0", "iothread" : "iothread1"} } }
*3. Start the secondary VM (SVM) with the following command*
  qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
    -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
    -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
    -device rtl8139,id=e0,netdev=hn0 \
    -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
    -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
    -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
    -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
    -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
    -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,node-name=node1 \
    -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=active-disk0,file.file.filename=active-disk.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
file.backing.backing=colo-disk0,node-name=node2 \
    -incoming tcp:0:9998
*4. As described in the document, create the NBD server and start the migration with QMP commands*
[image: image.png]
*5. Kill the PVM and fail over to the SVM*
[image: image.png]
*6. Reboot the secondary VM; we then get the error shown above.*
This error is very likely to occur.
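
The screenshots for steps 4 and 5 are not preserved in this archive. As a rough sketch only, and assuming they followed the QEMU COLO document, the secondary-side QMP commands would look something like the output of the script below. The port number, the exported device name, the listen address, and both function names are placeholders for illustration, not values read from the screenshots.

```python
import json


def secondary_nbd_commands(port, device="colo-disk0", host="0.0.0.0"):
    """Step 4, secondary side: export the disk over NBD for the primary.

    port, device and host are placeholders chosen for this sketch.
    """
    return [
        {"execute": "nbd-server-start",
         "arguments": {"addr": {"type": "inet",
                                "data": {"host": host, "port": str(port)}}}},
        {"execute": "nbd-server-add",
         "arguments": {"device": device, "writable": True}},
    ]


def secondary_failover_commands():
    """Step 5, after the primary is gone: stop NBD and let the SVM take over."""
    return [
        {"execute": "nbd-server-stop"},
        {"execute": "x-colo-lost-heartbeat"},
    ]


if __name__ == "__main__":
    # 9999 is only an example port.
    for cmd in secondary_nbd_commands(9999) + secondary_failover_commands():
        print(json.dumps(cmd))
```

The primary-side part of step 4 (attaching the replication node and issuing the migrate command) is not covered by this sketch.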

We can repair the image with *xfs_repair*; after that, rebooting the VM
works.
Command:
xfs_repair -L /dev/sda1

Do you have any idea what causes this problem?

Best regards,
Daniel Cho.


[-- Attachment #2: image.png --]
[-- Type: image/png, Size: 146997 bytes --]

[-- Attachment #3: image.png --]
[-- Type: image/png, Size: 21299 bytes --]


* Re: The problems about COLO
  2019-11-07  8:14   ` Daniel Cho
@ 2019-11-07 13:34     ` Lukas Straub
  2019-11-08  3:43       ` Daniel Cho
  0 siblings, 1 reply; 5+ messages in thread
From: Lukas Straub @ 2019-11-07 13:34 UTC (permalink / raw)
  To: Daniel Cho; +Cc: qemu-devel

On Thu, 7 Nov 2019 16:14:43 +0800
Daniel Cho <danielcho@qnap.com> wrote:

> Hi Lukas,
> Thanks for your reply.
>
> However, when we tested question 1 using the steps listed below the error
> message, we noticed that the secondary VM's image breaks when it reboots.
> Do you have any idea what causes this problem?

Hi Daniel,
The disks have to be synchronized before you can start COLO. So try something like this:

{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://SECONDARY:?/colo-disk0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Then, after the job is ready:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }

And then you can add the replication driver and start COLO.
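
The sequence above can be sketched as a small script. It only builds the QMP messages and recognizes the BLOCK_JOB_READY event that signals "the job is ready"; it does not talk to QEMU. The function names are made up for illustration, and the host and port arguments stand in for the "SECONDARY" and "?" placeholders in the command above, which are not specified in this mail.

```python
import json


def resync_commands(secondary_host, nbd_port, export="colo-disk0"):
    """Return (start, finish) QMP commands for a full disk resync.

    secondary_host and nbd_port are placeholders supplied by the caller.
    """
    start = {
        "execute": "drive-mirror",
        "arguments": {
            "device": "colo-disk0",
            "job-id": "resync",
            "target": f"nbd://{secondary_host}:{nbd_port}/{export}",
            "mode": "existing",
            "format": "raw",
            "sync": "full",
        },
    }
    # Sent only after the mirror job has reported that it is ready.
    finish = [
        {"execute": "stop"},
        {"execute": "block-job-cancel", "arguments": {"device": "resync"}},
    ]
    return start, finish


def job_is_ready(message, job_id="resync"):
    """True if a decoded QMP message is BLOCK_JOB_READY for job_id."""
    return (
        message.get("event") == "BLOCK_JOB_READY"
        and message.get("data", {}).get("device") == job_id
    )


if __name__ == "__main__":
    # 192.0.2.2 and 9999 are example values only.
    start, finish = resync_commands("192.0.2.2", 9999)
    print(json.dumps(start))
    for cmd in finish:
        print(json.dumps(cmd))
```

A driver script would send the first command, read QMP messages until job_is_ready() returns True, and only then send the remaining two.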

Regards,
Lukas Straub



* Re: The problems about COLO
  2019-11-07 13:34     ` Lukas Straub
@ 2019-11-08  3:43       ` Daniel Cho
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Cho @ 2019-11-08  3:43 UTC (permalink / raw)
  To: Lukas Straub; +Cc: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 13028 bytes --]

Lukas Straub <lukasstraub2@web.de> wrote on Thu, Nov 7, 2019 at 9:34 PM:

> On Thu, 7 Nov 2019 16:14:43 +0800
> Daniel Cho <danielcho@qnap.com> wrote:
>
> > Hi Lukas,
> > Thanks for your reply.
> >
> > However, when we tested question 1 using the steps listed below the error
> > message, we noticed that the secondary VM's image breaks when it reboots.
> > Here is the error message.
> > -------------------------------------------------------------------
> > [    1.280299] XFS (sda1): Mounting V5 Filesystem
> > [    1.428418] input: ImExPS/2 Generic Explorer Mouse as
> > /devices/platform/i8042/serio1/input/input2
> > [    1.501320] XFS (sda1): Starting recovery (logdev: internal)
> > [    1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
> > [    1.505534] Switched to clocksource tsc
> > [    2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line
> > 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller
> xfs_free_extent+0xfc/0x130
> > [xfs]
> > [    2.032743] CPU: 0 PID: 300 Comm: mount Not tainted
> > 3.10.0-693.11.6.el7.x86_64 #1
> > [    2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS
> > rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [    2.035882] Call Trace:
> > [    2.036494]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> > [    2.037315]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> > [    2.038150]  [<ffffffffc0138e6c>] ? xfs_free_extent+0xfc/0x130 [xfs]
> > [    2.039046]  [<ffffffffc01362da>] xfs_free_ag_extent+0x20a/0x780 [xfs]
> > [    2.039920]  [<ffffffffc0138e6c>] xfs_free_extent+0xfc/0x130 [xfs]
> > [    2.040768]  [<ffffffffc01a7736>] xfs_trans_free_extent+0x26/0x60
> [xfs]
> > [    2.041642]  [<ffffffffc019fade>] xlog_recover_process_efi+0x17e/0x1c0
> > [xfs]
> > [    2.042558]  [<ffffffffc01a1e37>]
> > xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> > [    2.043771]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> > [    2.044650]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> > [    2.045518]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> > [    2.046341]  [<ffffffffc017d220>] ?
> xfs_filestream_get_parent+0x80/0x80
> > [xfs]
> > [    2.047260]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> > [    2.048116]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> > [    2.048881]  [<ffffffffc01919b0>] ?
> > xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> > [    2.050105]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> > [    2.050906]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
> > [    2.051963]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> > [    2.059431]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> > [    2.060283]  [<ffffffff81226483>] do_mount+0x233/0xaf0
> > [    2.061081]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> > [    2.061844]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> > [    2.062619]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> > [    2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of
> > file fs/xfs/xfs_trans.c.  Caller xlog_recover_process_efi+0x18e/0x1c0
> [xfs]
> > [    2.065260] CPU: 0 PID: 300 Comm: mount Not tainted
> > 3.10.0-693.11.6.el7.x86_64 #1
> > [    2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS
> > rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > [    2.068023] Call Trace:
> > [    2.068590]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> > [    2.069403]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> > [    2.070318]  [<ffffffffc019faee>] ?
> xlog_recover_process_efi+0x18e/0x1c0
> > [xfs]
> > [    2.071538]  [<ffffffffc019594d>] xfs_trans_cancel+0xbd/0xe0 [xfs]
> > [    2.072429]  [<ffffffffc019faee>] xlog_recover_process_efi+0x18e/0x1c0
> > [xfs]
> > [    2.073339]  [<ffffffffc01a1e37>]
> > xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> > [    2.074561]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> > [    2.075421]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> > [    2.076301]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> > [    2.077128]  [<ffffffffc017d220>] ?
> xfs_filestream_get_parent+0x80/0x80
> > [xfs]
> > [    2.078049]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> > [    2.078900]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> > [    2.079667]  [<ffffffffc01919b0>] ?
> > xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> > [    2.080883]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> > [    2.081687]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
> > [    2.082457]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> > [    2.083258]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> > [    2.084057]  [<ffffffff81226483>] do_mount+0x233/0xaf0
> > [    2.084797]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> > [    2.085568]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> > [    2.086324]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> > [    2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line
> 985
> > of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffc0195966
> > [    2.088795] XFS (sda1): Corruption of in-memory data detected.
> Shutting
> > down filesystem
> > [    2.090273] XFS (sda1): Please umount the filesystem and rectify the
> > problem(s)
> > [    2.091519] XFS (sda1): Failed to recover EFIs
> > [    2.092299] XFS (sda1): log mount finish failed
> > [FAILED] Failed to mount /sysroot.
> > .
> > .
> > .
> > Generating "/run/initramfs/rdsosreport.txt"
> > [    2.178103] blk_update_request: I/O error, dev fd0, sector 0
> > [    2.246106] blk_update_request: I/O error, dev fd0, sector 0
> >   -------------------------------------------------------------------
> >
> > Here is the replicated steps:
> > *1. Start primary VM with command, and do every thing you want on PVM*
> >         qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp
> > stdio -vnc :5 \
> >   -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
> >   -netdev
> > tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
> >   -device rtl8139,id=e0,netdev=hn0 \
> >   -drive
> >
> if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2
> > *2. Add the device and object to PVM with qmp command*
> >       {'execute':'qmp_capabilities'}
> >       {"execute":"chardev-add", "arguments":{ "id" : "mirror0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9003" } } } } }}
> >       {"execute":"chardev-add", "arguments":{ "id" : "compare1", "backend" : { "type" : "socket", "data" : { "server": true, "wait": true, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9004" } } } } }}
> >       {"execute":"chardev-add", "arguments":{ "id" : "compare0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
> >       {"execute":"chardev-add", "arguments":{ "id" : "compare0-0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
> >       {"execute":"chardev-add", "arguments":{ "id" : "compare_out", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } }}
> >       {"execute":"chardev-add", "arguments":{ "id" : "compare_out0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } } }
> >       {"execute":"object-add", "arguments":{ "qom-type" : "filter-mirror", "id" : "m0", "props": { "netdev": "hn0", "outdev" : "mirror0", "queue" : "tx" } } }
> >       {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire0", "props": { "netdev": "hn0", "indev" : "compare_out", "queue" : "rx" } } }
> >       {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire1", "props": { "netdev": "hn0", "outdev" : "compare0", "queue" : "rx" } } }
> >       {"execute":"object-add", "arguments":{ "qom-type" : "iothread", "id" : "iothread1", "props": {} } }
> >       {"execute":"object-add", "arguments":{ "qom-type" : "colo-compare", "id" : "comp0", "props": { "primary_in" : "compare0-0", "secondary_in" : "compare1", "outdev" : "compare_out0", "iothread" : "iothread1"} } }
> > *3. Start the secondary VM with command*
> >   qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
> >   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
> >   -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
> >   -device rtl8139,id=e0,netdev=hn0 \
> >   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
> >   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
> >   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
> >   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
> >   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
> >   -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,\
> > node-name=node1 \
> >   -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
> > top-id=active-disk0,file.file.filename=active-disk.qcow2,\
> > file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
> > file.backing.backing=colo-disk0,node-name=node2 \
> >   -incoming tcp:0:9998
> > *4. As described in the document, create the rbd server and migrate with qmp commands*
> > [image: image.png]
> > *5. Kill the PVM and failover to SVM*
> > [image: image.png]
> > *6. Reboot the secondary VM, then we will get the error.*
> > This error occurs with high probability.
> >
> > We can repair the image with *xfs_repair*; after that, rebooting the
> > VM works.
> > Command:
> > xfs_repair -L /dev/sda1
> >
> > Do you have any idea why this problem occurs?
>
> Hi Daniel,
> The disks have to be synchronized before you can start COLO. So try
> something like this:
>
> {'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0',
> 'job-id': 'resync', 'target': 'nbd://SECONDARY:?/colo-disk0', 'mode':
> 'existing', 'format': 'raw', 'sync': 'full'} }
>
> Then, after the job is ready:
> {'execute': 'stop'}
> {'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }
>
> And then you can add the replication driver and start colo.
>
> Regards,
> Lukas Straub
>

Hi Lukas,
      It works well, thanks for your help.
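
For reference, the resync sequence quoted above could be scripted roughly as
follows. This is only a minimal Python sketch, not a tested tool: the function
names resync_commands and job_is_ready are made up for illustration, the
target URL is copied from the quoted example with its "?" port placeholder
left as-is, and actually sending the commands over a QMP socket is left out.
It assumes the mirror job announces readiness with a BLOCK_JOB_READY event
whose "device" field carries the job id.

```python
import json


def resync_commands(device="colo-disk0", job_id="resync",
                    target="nbd://SECONDARY:?/colo-disk0"):
    """Return (start, finish) lists of QMP commands for a full disk resync.

    `start` is sent first; `finish` is sent only after the mirror job
    has reported that it is ready.
    """
    start = [{
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "job-id": job_id,
            "target": target,  # placeholder host/port, as in the quoted mail
            "mode": "existing",
            "format": "raw",
            "sync": "full",
        },
    }]
    finish = [
        # Pause the guest so no new writes arrive, then end the mirror job.
        {"execute": "stop"},
        {"execute": "block-job-cancel", "arguments": {"device": job_id}},
    ]
    return start, finish


def job_is_ready(event, job_id="resync"):
    """True if `event` is the BLOCK_JOB_READY notification for our job."""
    return (event.get("event") == "BLOCK_JOB_READY"
            and event.get("data", {}).get("device") == job_id)


if __name__ == "__main__":
    start, finish = resync_commands()
    for cmd in start + finish:
        print(json.dumps(cmd))
```

A caller would send the start command, read events until job_is_ready()
returns True, then send the finish commands before adding the replication
driver and starting COLO.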

Separately, could we change the secondary VM's *replication* driver to the
*quorum* driver to achieve continuous VM replication?

Here is the start command.
Original :
qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,\
node-name=node1 \
   -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
top-id=active-disk0,file.file.filename=active-disk.qcow2,\
file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
file.backing.backing=colo-disk0,node-name=node2 \
   -incoming tcp:0:9998

Modify :
qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
children.0.file.filename=$image_path,children.0.driver=qcow2 \
   -incoming tcp:0:9998
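
In case it helps when scripting the modified variant, here is a minimal
Python sketch that assembles the quorum -drive argument from key/value pairs
instead of hand-wrapping it. The helper names drive_option and quorum_drive
and the example image path are hypothetical; the property names and values
are the ones from the command above. Commas inside a value are doubled,
which is how QEMU's option syntax escapes them.

```python
def drive_option(props):
    """Join key=value pairs into one -drive argument, keeping their order.

    A literal comma inside a value is written as ',,' so that it is not
    taken as a separator between options.
    """
    return ",".join(
        "{}={}".format(key, str(value).replace(",", ",,"))
        for key, value in props.items()
    )


def quorum_drive(image_path):
    """Build the quorum -drive argument used in the modified command."""
    return drive_option({
        "if": "ide",
        "id": "colo-disk0",
        "driver": "quorum",
        "read-pattern": "fifo",
        "vote-threshold": 1,
        "children.0.file.filename": image_path,
        "children.0.driver": "qcow2",
    })


if __name__ == "__main__":
    # Hypothetical image path, for illustration only.
    print("-drive", quorum_drive("/path/to/image.qcow2"))
```

Building the string in one place keeps the whole argument on a single line,
so no backslash continuation can split it in the middle.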

Best regards,
Daniel Cho


end of thread, other threads:[~2019-11-08  3:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-31  9:05 The problems about COLO Daniel Cho
2019-11-04 18:37 ` Lukas Straub
2019-11-07  8:14   ` Daniel Cho
2019-11-07 13:34     ` Lukas Straub
2019-11-08  3:43       ` Daniel Cho
