* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
@ 2017-08-18 14:15 ` ChristianEhrhardt
2017-08-18 14:22 ` ChristianEhrhardt
` (47 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 14:15 UTC (permalink / raw)
To: qemu-devel
I reached out to the people involved in the initial fixes which were related to image locking and qemu-nbd. But this might after all be something completely different.
Yet until we know better it might be wiser to reach out to more people.
=> http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg03465.html
** Also affects: qemu (Ubuntu)
Importance: Undecided
Status: New
** Also affects: qemu
Importance: Undecided
Status: New
** Changed in: qemu (Ubuntu)
Assignee: (unassigned) => ChristianEhrhardt (paelzer)
** Changed in: qemu (Ubuntu)
Status: New => Confirmed
** Changed in: qemu (Ubuntu)
Importance: Undecided => High
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
2017-08-18 14:15 ` [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10 ChristianEhrhardt
@ 2017-08-18 14:22 ` ChristianEhrhardt
2017-08-18 15:03 ` ChristianEhrhardt
` (46 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 14:22 UTC (permalink / raw)
To: qemu-devel
The source log is virsh, I need to ensure we also have a source libvirtd
log ...
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
2017-08-18 14:15 ` [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10 ChristianEhrhardt
2017-08-18 14:22 ` ChristianEhrhardt
@ 2017-08-18 15:03 ` ChristianEhrhardt
2017-08-18 15:03 ` ChristianEhrhardt
` (45 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:03 UTC (permalink / raw)
To: qemu-devel
Since this is pretty reproducible here on the setup:
- Two systems (actually two lxd containers on one system)
- Both running Artful with qemu 2.10-rc3 and libvirt 3.6
- Storage path is not shared but set up equivalent with a manual pre-copy
- Migration with post copy is failing, no other options set, example:
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.100/system
- Same setup works on the qemu versions in Xenial (2.5), Yakkety (2.6), and Zesty (2.8)
- In fact it seems even a migration from a Zesty qemu (2.8) to the new (2.10) works
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (2 preceding siblings ...)
2017-08-18 15:03 ` ChristianEhrhardt
@ 2017-08-18 15:03 ` ChristianEhrhardt
2017-08-18 15:04 ` ChristianEhrhardt
` (44 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:03 UTC (permalink / raw)
To: qemu-devel
To simplify downloading the logs I'm attaching here a full set of:
- virsh
- source libvirtd
- target libvirtd
** Attachment added: "virsh-source.log"
https://bugs.launchpad.net/qemu/+bug/1711602/+attachment/4934823/+files/virsh-source.log
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (3 preceding siblings ...)
2017-08-18 15:03 ` ChristianEhrhardt
@ 2017-08-18 15:04 ` ChristianEhrhardt
2017-08-18 15:04 ` ChristianEhrhardt
` (43 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:04 UTC (permalink / raw)
To: qemu-devel
** Attachment added: "libvirtd-target.log"
https://bugs.launchpad.net/qemu/+bug/1711602/+attachment/4934824/+files/libvirtd-target.log
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (4 preceding siblings ...)
2017-08-18 15:04 ` ChristianEhrhardt
@ 2017-08-18 15:04 ` ChristianEhrhardt
2017-08-18 15:06 ` ChristianEhrhardt
` (42 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:04 UTC (permalink / raw)
To: qemu-devel
** Attachment added: "libvirtd-source.log"
https://bugs.launchpad.net/qemu/+bug/1711602/+attachment/4934825/+files/libvirtd-source.log
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (5 preceding siblings ...)
2017-08-18 15:04 ` ChristianEhrhardt
@ 2017-08-18 15:06 ` ChristianEhrhardt
2017-08-18 15:09 ` ChristianEhrhardt
` (41 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:06 UTC (permalink / raw)
To: qemu-devel
I've seen something in the logs which I want to eliminate from the list of possibilities:
"warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]"
We had always a patch I questioned to enable svm capabilitiy for guests in general, it worked all the time but I'd have preferred to be an explicit user opt-in.
I remember seeing the warning in the past which made me neglect it at first, but maybe the target capability check is now more strict.
I'll drop this change for a test build and run all that again to be sure.
I doubt that is the reason, but let verifying this particular lead be my task - please be open with other suggestions.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (6 preceding siblings ...)
2017-08-18 15:06 ` ChristianEhrhardt
@ 2017-08-18 15:09 ` ChristianEhrhardt
2017-08-18 17:56 ` Dr. David Alan Gilbert
` (40 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-18 15:09 UTC (permalink / raw)
To: qemu-devel
Currently I plan to test with the svm/vmx changes disabled as well as a
cross test on ppc64 and s390x which might complete the picture.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (7 preceding siblings ...)
2017-08-18 15:09 ` ChristianEhrhardt
@ 2017-08-18 17:56 ` Dr. David Alan Gilbert
2017-08-21 14:55 ` ChristianEhrhardt
` (39 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-18 17:56 UTC (permalink / raw)
To: qemu-devel
The 'host doesn't support requested feature' is probably a red-herring in this case
The fact it's failing with an IO error but nothing else suggests either:
a) it's something closing the socket between the two qemu's
b) The I/O error is from storage/NBD
Assuming it fails on precopy, I'd look at the qemu_loadvm_state_section_startfull to watch all the device states load.
You could also add some debug/tracing in qemu_loadvm_state to see at what point it fails.
Dave
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (8 preceding siblings ...)
2017-08-18 17:56 ` Dr. David Alan Gilbert
@ 2017-08-21 14:55 ` ChristianEhrhardt
2017-08-21 14:55 ` ChristianEhrhardt
` (38 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-21 14:55 UTC (permalink / raw)
To: qemu-devel
Hi David,
confirming the red-herring on the cpu feature - I had a build without runnign over the weekend so this was easy to test - and still the migration fails.
I have about 7 seconds from kicking off the migration until the sync seems to pass its first phase and then qemu is exiting (at least that is what libvirt thinks):
"closed without SHUTDOWN event; assuming the domain crashed"
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (9 preceding siblings ...)
2017-08-21 14:55 ` ChristianEhrhardt
@ 2017-08-21 14:55 ` ChristianEhrhardt
2017-08-21 14:55 ` ChristianEhrhardt
` (37 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-21 14:55 UTC (permalink / raw)
To: qemu-devel
Since the qemu "lives" in that time I can try to debug what happens.
With strace to sniff where things could be I see right before the end:
0.000203 recvmsg(27, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="", iov_len=32768}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 0 <0.000014>
0.000049 futex(0xca65dacf4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0xca4785a80, 20) = 1 <0.000016>
0.000038 getpid() = 29750 <0.000023>
0.000011 tgkill(29750, 29760, SIGUSR1) = 0 <0.000030>
0.000012 futex(0xca4785a80, FUTEX_WAKE_PRIVATE, 1) = 1 <0.000048>
0.000010 futex(0xca47b46e4, FUTEX_WAIT_PRIVATE, 19, NULL) = 0 <0.002215>
0.000032 sendmsg(21, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="{\"timestamp\": {\"seconds\": 1503322067, \"microseconds\": 613178}, \"event\": \"MIGRATION\", \"data\": {\"status\": \"failed\"}}\r\n", iov_len=116}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 116 <0.000024>
0.000074 write(2, "2017-08-21T13:27:47.613276Z qemu-system-x86_64: load of migration failed: Input/output error\n", 93) = 93 <0.000022>
0.000055 close(27) = 0 <0.000090>
Now 29750 is the main process/tgid and 29760 is the third process started on the migration.
It is the one that does the vcpu ioctl's so I assume this is just the one representing the vpu.
Well gdb should be more useful so looking with that.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (10 preceding siblings ...)
2017-08-21 14:55 ` ChristianEhrhardt
@ 2017-08-21 14:55 ` ChristianEhrhardt
2017-08-21 14:56 ` ChristianEhrhardt
` (36 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-21 14:55 UTC (permalink / raw)
To: qemu-devel
As expected by David when I trace on process_incoming_migration_co which prints the "readable" error I see the error pop up on "qemu_loadvm_state"
It appears as "Thread 4 "CPU 0/KVM" received signal SIGUSR1" and similar which is just the break down of the guest.
Diving "into" qemu_loadvm_state reveals that it gets until "cpu_synchronize_all_pre_loadvm".
In qemu_loadvm_state none of the initial checks fail.
Then the "ret = vmstate_load_state(f, &vmstate_configuration, &savevm_state, 0);" seems to work fine was well.
It seems reproducible in "cpu_synchronize_all_pre_loadvm" where the crash happens.
I can catch the incoming qemu easily with:
$ while ! pid=$(pidof qemu-system-x86_64); do /bin/true; done; gdb --pid ${pid}
# Then in gdb break on "cpu_synchronize_all_pre_loadvm"
# And when I step over it I the next thing I see is the "beginning of the end" for the process
Thread 4 "CPU 0/KVM" received signal SIGUSR1, User defined signal 1.
[Switching to Thread 0x7f418136e700 (LWP 3887)]
__lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
The guest only has one vcpu, so CPU_FOREACH(cpu) is not much of a loop.
Looking down that path I tracked it to along:
cpu_synchronize_all_pre_loadvm -> cpu_synchronize_pre_loadvm -> kvm_cpu_synchronize_pre_loadvm -> do_run_on_cpu
Here it queues the function "do_kvm_cpu_synchronize_pre_loadvm" onto the vcpu.
That is done via queue_work_on_cpu(cpu, &wi); which in turn uses eventually "qemu_cpu_kick_thread(cpu);"
That seems to trigger the first SIGUSR1
Following that I get the breakpoint that I set at "do_kvm_cpu_synchronize_pre_loadvm".
The actual function only sets "cpu->vcpu_dirty = true;" and works.
On the way out from there, there is a "qemu_kvm_wait_io_event" which
leads to the next SIGUSR1.
Going on I see another "do_run_on_cpu" with "vapic_do_enable_tpr_reporting" as the function.
I set a breakpoint on that as well but took a full CPUstate before going on:
p *cpu
$4 = {parent_obj = {parent_obj = {class = 0x5ffe7170c0, free = 0x7f62328f15a0 <g_free>, properties = 0x5ffe736e40, ref = 1,
parent = 0x5ffe726160}, id = 0x0, realized = true, pending_deleted_event = false, opts = 0x0, hotplugged = 0, parent_bus = 0x0, gpios = {
lh_first = 0x0}, child_bus = {lh_first = 0x0}, num_child_bus = 0, instance_id_alias = -1, alias_required_for_version = 0}, nr_cores = 1,
nr_threads = 1, thread = 0x5ffe803cd0, thread_id = 8498, running = false, has_waiter = false, halt_cond = 0x5ffe803cf0, thread_kicked = true,
created = true, stop = false, stopped = true, unplug = false, crash_occurred = false, exit_request = false, interrupt_request = 4,
singlestep_enabled = 0, icount_budget = 0, icount_extra = 0, jmp_env = {{__jmpbuf = {0, 0, 0, 0, 0, 0, 0, 0}, __mask_was_saved = 0,
__saved_mask = {__val = {0 <repeats 16 times>}}}}, work_mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
__kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0},
initialized = true}, queued_work_first = 0x5fffefc990, queued_work_last = 0x5fffefc990, cpu_ases = 0x5ffe803c10, num_ases = 1,
as = 0x5ffe7f9690, memory = 0x5ffe725bd0, env_ptr = 0x5ffe7e44c0, tb_jmp_cache = {0x0 <repeats 4096 times>}, gdb_regs = 0x0,
gdb_num_regs = 57, gdb_num_g_regs = 57, node = {tqe_next = 0x0, tqe_prev = 0x5ffc1783f0 <cpus>}, breakpoints = {tqh_first = 0x0,
tqh_last = 0x5ffe7e4430}, watchpoints = {tqh_first = 0x0, tqh_last = 0x5ffe7e4440}, watchpoint_hit = 0x0, opaque = 0x0, mem_io_pc = 0,
mem_io_vaddr = 0, kvm_fd = 19, kvm_state = 0x5ffe7357a0, kvm_run = 0x7f62374bc000, trace_dstate_delayed = {0}, trace_dstate = {0},
cpu_index = 0, halted = 1, can_do_io = 1, exception_index = -1, vcpu_dirty = true, throttle_thread_scheduled = false, icount_decr = {u32 = 0,
u16 = {low = 0, high = 0}}, hax_vcpu = 0x0, pending_tlb_flush = 7}
Continuing I hit the "vapic_do_enable_tpr_reporting" with qemu still
running.
Things go on, the next candidate for "do_run_on_cpu" is "kvm_apic_put"
Another SIGUSR1 to get that kicked it seems.
"kvm_apic_put" breakpoint is reached after that kick.
Next for "do_run_on_cpu" is "do_kvm_cpu_synchronize_post_init". And
that triggered the fourth SIGUSR1. Before I only saw four, hopefully
that is the same with so much breakpoints.
Checked the cpu state again:
1880 static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
1881 {
1882 kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE);
1883 cpu->vcpu_dirty = false;
1884 }
1885
(gdb) p cpu
$5 = (CPUState *) 0x5ffe7dc230
(gdb) p *cpu
$6 = {parent_obj = {parent_obj = {class = 0x5ffe7170c0, free = 0x7f62328f15a0 <g_free>, properties = 0x5ffe736e40, ref = 1,
parent = 0x5ffe726160}, id = 0x0, realized = true, pending_deleted_event = false, opts = 0x0, hotplugged = 0, parent_bus = 0x0, gpios = {
lh_first = 0x0}, child_bus = {lh_first = 0x0}, num_child_bus = 0, instance_id_alias = -1, alias_required_for_version = 0}, nr_cores = 1,
nr_threads = 1, thread = 0x5ffe803cd0, thread_id = 8498, running = false, has_waiter = false, halt_cond = 0x5ffe803cf0, thread_kicked = false,
created = true, stop = false, stopped = true, unplug = false, crash_occurred = false, exit_request = false, interrupt_request = 4,
singlestep_enabled = 0, icount_budget = 0, icount_extra = 0, jmp_env = {{__jmpbuf = {0, 0, 0, 0, 0, 0, 0, 0}, __mask_was_saved = 0,
__saved_mask = {__val = {0 <repeats 16 times>}}}}, work_mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
__kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0},
initialized = true}, queued_work_first = 0x0, queued_work_last = 0x0, cpu_ases = 0x5ffe803c10, num_ases = 1, as = 0x5ffe7f9690,
memory = 0x5ffe725bd0, env_ptr = 0x5ffe7e44c0, tb_jmp_cache = {0x0 <repeats 4096 times>}, gdb_regs = 0x0, gdb_num_regs = 57,
gdb_num_g_regs = 57, node = {tqe_next = 0x0, tqe_prev = 0x5ffc1783f0 <cpus>}, breakpoints = {tqh_first = 0x0, tqh_last = 0x5ffe7e4430},
watchpoints = {tqh_first = 0x0, tqh_last = 0x5ffe7e4440}, watchpoint_hit = 0x0, opaque = 0x0, mem_io_pc = 0, mem_io_vaddr = 0, kvm_fd = 19,
kvm_state = 0x5ffe7357a0, kvm_run = 0x7f62374bc000, trace_dstate_delayed = {0}, trace_dstate = {0}, cpu_index = 0, halted = 1, can_do_io = 1,
exception_index = -1, vcpu_dirty = true, throttle_thread_scheduled = false, icount_decr = {u32 = 0, u16 = {low = 0, high = 0}},
hax_vcpu = 0x0, pending_tlb_flush = 7}
And from here stepping into kvm_arch_put_registers:
kvm_arch_put_registers (cpu=cpu@entry=0x5ffe7dc230, level=level@entry=3) at ./target/i386/kvm.c:2591
That still is the same vcpu as all the time, x86_cpu is optimized out
unfortunately as I had no full debug build with -O0.
I see it setting up regs in kvm_arch_put_registers without error (all ret=0) and return to do_kvm_cpu_synchronize_post_init.
This eventually sets "cpu->vcpu_dirty = false;"
After this seems all good I steped through the "way out" and there came another "qemu_kvm_wait_io_event(cpu);".
Without considering this being critical I stepped with "n" and qemu was gone with all its threads.
qemu_kvm_cpu_thread_fn (arg=0x5ffe7dc230) at ./cpus.c:1134
1134 } while (!cpu->unplug || cpu_can_run(cpu));
(gdb) n
1127 if (cpu_can_run(cpu)) {
(gdb) n
1133 qemu_kvm_wait_io_event(cpu);
(gdb) n
[Thread 0x7f6227857700 (LWP 8498) exited]
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (11 preceding siblings ...)
2017-08-21 14:55 ` ChristianEhrhardt
@ 2017-08-21 14:56 ` ChristianEhrhardt
2017-08-21 15:46 ` Dr. David Alan Gilbert
` (35 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-21 14:56 UTC (permalink / raw)
To: qemu-devel
After this I was trying to start closer to the issue, so I put a break on "process_incoming_migration_co" (to skip over much of the initial setup).
Once that was hit I added "qemu_kvm_cpu_thread_fn" and "qemu_kvm_wait_io_event".
Of course when I try that the other functions do not trigger.
Maybe it is partially influenced by the debugging itself and/or the timing changes it causes.
I'll check what else I can find with slightly different debugging, but
so much as an update for now.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (12 preceding siblings ...)
2017-08-21 14:56 ` ChristianEhrhardt
@ 2017-08-21 15:46 ` Dr. David Alan Gilbert
2017-08-22 8:51 ` ChristianEhrhardt
` (34 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-21 15:46 UTC (permalink / raw)
To: qemu-devel
oh yeh you want to tell gdb to ignore SIGUSR1, something like:
handle SIGUSR1 nostop noprint pass
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (13 preceding siblings ...)
2017-08-21 15:46 ` Dr. David Alan Gilbert
@ 2017-08-22 8:51 ` ChristianEhrhardt
2017-08-22 8:54 ` ChristianEhrhardt
` (33 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 8:51 UTC (permalink / raw)
To: qemu-devel
Sure, but initially I wanted to see what is going on overall so I let it
pop up.
Started a debugging another session today.
First I confirmed with
(gdb) catch syscall exit exit_group
That this is the "normal" exit along the error message we knew:
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
error_report("load of migration failed: %s", strerror(-ret));
qemu_fclose(mis->from_src_file);
exit(EXIT_FAILURE);
I found that already the retval of qemu_loadvm_state it -5.
Every thing else afterwards is cleanup.
Inside qemu_loadvm_state the first 2/3 pass and then that ret=-5 is from
"ret = qemu_file_get_error(f);".
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (14 preceding siblings ...)
2017-08-22 8:51 ` ChristianEhrhardt
@ 2017-08-22 8:54 ` ChristianEhrhardt
2017-08-22 8:58 ` ChristianEhrhardt
` (32 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 8:54 UTC (permalink / raw)
To: qemu-devel
Via a watchpoints I found that the error is set by qemu_fill_buffer.
b qemu_loadvm_state
handle SIGUSR1 nostop noprint pass
c
# on the break check and watch the status
(gdb) p f
$1 = (QEMUFile *) 0xb9babb3c00
(gdb) p *f
$2 = {ops = 0xb9b89880a0 <channel_input_ops>, hooks = 0x0, opaque = 0xb9bbabfe00, bytes_xfer = 0, xfer_limit = 0, pos = 0, buf_index = 0,
buf_size = 0, buf = '\000' <repeats 32767 times>, may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, iovcnt = 0,
last_error = 0}
# ok still no err, set watchpoint
(gdb) p &(f->last_error)
$4 = (int *) 0xb9babbc044
(gdb) watch *(int *) 0xb9babbc044
Hardware watchpoint 2: *(int *) 0xb9babbc044
# This catches the following
Thread 1 "qemu-system-x86" hit Hardware watchpoint 2: *(int *) 0xb9babbc044
Old value = 0
New value = -5
0x000000b9b82bd0ec in qemu_file_set_error (ret=-5, f=0xb9babb3c00) at ./migration/qemu-file.c:125
warning: Source file is more recent than executable.
125 f->last_error = ret;
(gdb) bt
#0 0x000000b9b82bd0ec in qemu_file_set_error (ret=-5, f=0xb9babb3c00) at ./migration/qemu-file.c:125
#1 qemu_fill_buffer (f=0xb9babb3c00) at ./migration/qemu-file.c:299
#2 0x000000b9b82bdbb1 in qemu_peek_byte (f=0xb9babb3c00, offset=0) at ./migration/qemu-file.c:553
#3 0x000000b9b82bdc1b in qemu_get_byte (f=f@entry=0xb9babb3c00) at ./migration/qemu-file.c:566
#4 0x000000b9b82b5853 in qemu_loadvm_state_main (f=f@entry=0xb9babb3c00, mis=0xb9b8a4f700 <mis_current>) at ./migration/savevm.c:1947
#5 0x000000b9b82b864f in qemu_loadvm_state (f=f@entry=0xb9babb3c00) at ./migration/savevm.c:2032
#6 0x000000b9b82af5c3 in process_incoming_migration_co (opaque=0xb9babb3c00) at ./migration/migration.c:320
#7 0x000000b9b83e42a6 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ./util/coroutine-ucontext.c:79
#8 0x00007fbf3702fac0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#9 0x00007fffe3f9f800 in ?? ()
#10 0x0000000000000000 in ?? ()
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (15 preceding siblings ...)
2017-08-22 8:54 ` ChristianEhrhardt
@ 2017-08-22 8:58 ` ChristianEhrhardt
2017-08-22 9:01 ` ChristianEhrhardt
` (31 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 8:58 UTC (permalink / raw)
To: qemu-devel
So this is failing I/O that iterates over a channel.
I was tracking down the len, pending and pos used.
I found that this is not completely broken (like no access or generla I/O error)
It starts at pos 0 and iterated with varying offsets, but works for quite some time.
Example:
[...]
Thread 1 "qemu-system-x86" hit Breakpoint 2, qemu_fill_buffer (f=f@entry=0xd3b66f3c00) at ./migration/qemu-file.c:295
295 if (len > 0) {
$11183 = 28728
$11184 = 4040
$11185 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 107130146,
buf_index = 0, buf_size = 4040,
buf = "\v\327\a\000\021\000\[...]\000"...,
may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>}, iovcnt = 0, last_error = 0}
[...]
Well you could see the whole file read passing by one by one buffer
Yet this isn't particularly fast, so track the one that has len==0
(gdb) b ./migration/qemu-file.c:295 if len == 0
And I got it as:
(gdb) p *f
$11195 = {ops = 0xd3b3d740a0 <channel_input_ops>, hooks = 0x0, opaque = 0xd3b75ee490, bytes_xfer = 0, xfer_limit = 0, pos = 319638837,
buf_index = 0, buf_size = 0, buf = '\000' <repeats 5504 times>..., may_free = {0}, iov = {{iov_base = 0x0, iov_len = 0} <repeats 64 times>},
iovcnt = 0, last_error = 0}
Here pending == 0 so buf_size = 0 as well also pos is further down
incremented to 319638837.
Checking in detail I found that I had pending=0 and buf_size=0 as well as non aligned pos entried, but they worked.
So I excluded the buf_size=0/pending=0 as well as the alignment as reasons.
Maybe it just iterates pos out of the range that is working?
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (16 preceding siblings ...)
2017-08-22 8:58 ` ChristianEhrhardt
@ 2017-08-22 9:01 ` ChristianEhrhardt
2017-08-22 9:08 ` ChristianEhrhardt
` (30 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 9:01 UTC (permalink / raw)
To: qemu-devel
(gdb) handle SIGUSR1 nostop noprint pass
(gdb) b migration/qemu-file.c:295
(gdb) command
p f->pos
c
end
That showed the pos is ever increasing and fails at an offset it never read before. Yet the absolute number was different.
$1 = 0
$2 = 8948
$3 = 41423
[...]
$11359 = 326387440
$11360 = 326420208 => This was the one failing this time
This was a different f->pos than last time, so I wondered if this would change every time.
With a less interactive gdb config I got in three tries:
1. 313153311
2. 313313376
3. 313571856
So a different f->pos to fail each time.
Different but rather close.
I wondered if the reasons I got a higher one when tracing in more detail printing all offsets could be that there still is something copied/synced and only slowly gets available.
I stepped through rather slowly and got to 322429260 this time.
So slower continuing on the iteration over qemu_fill_buffer makes it fail "later"?
Finally it is surely interesting which channel that actually is- likely the migration socket?
And yes, ioc->name in qio_channel_read is:
$8 = 0x56ab78e5c0 "migration-socket-incoming
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (17 preceding siblings ...)
2017-08-22 9:01 ` ChristianEhrhardt
@ 2017-08-22 9:08 ` ChristianEhrhardt
2017-08-22 9:26 ` Dr. David Alan Gilbert
` (29 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 9:08 UTC (permalink / raw)
To: qemu-devel
So TL;DR summary for now:
- error triggers in qio_channel_read
- file is migration-socket-incoming
- reads work a while, but then fail at high f->pos offsets (slightly different ones each time)
- slower execution seems to lead to slightly higher offsets that are failing
- only happens on --copy-storage-* migrations (libvirt/virsh argument)
I don't really know atm where to look deeper - is there a good side
channel that I could use to look at what is going on on the migration-
socket-incoming - Maybe from the source and target while I block in gdb?
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (18 preceding siblings ...)
2017-08-22 9:08 ` ChristianEhrhardt
@ 2017-08-22 9:26 ` Dr. David Alan Gilbert
2017-08-22 9:51 ` Dr. David Alan Gilbert
` (28 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 9:26 UTC (permalink / raw)
To: qemu-devel
OK, so that looks like a real case of the migration stream failing and getting an IO error; so the question is why:
a) Is the source qemu dieing first and closing the socket?
b) Is libvirt closing the socket for some reason
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (19 preceding siblings ...)
2017-08-22 9:26 ` Dr. David Alan Gilbert
@ 2017-08-22 9:51 ` Dr. David Alan Gilbert
2017-08-22 10:20 ` ChristianEhrhardt
` (27 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 9:51 UTC (permalink / raw)
To: qemu-devel
also, you might want to chase it a bit further down, I think we've got:
qemu-file-channel.c:channel_get_buffer
io/channel-socket.c or io/channel-file.c qio_channel_file_readv
it would be good to know what the readv/readmsg is actually
returning in the case where it's failing.
Dave
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (20 preceding siblings ...)
2017-08-22 9:51 ` Dr. David Alan Gilbert
@ 2017-08-22 10:20 ` ChristianEhrhardt
2017-08-22 10:57 ` Dr. David Alan Gilbert
` (26 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 10:20 UTC (permalink / raw)
To: qemu-devel
I'll track down the actual read and then add debugging the source at the same time (that should be the best way to track the migration socket on both sides).
This might be slightly tricky since I don't know exactly on which offset but I can surely start over 310*10^6 it seems.
I'll report back once I know more, thanks for your guidance David
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (21 preceding siblings ...)
2017-08-22 10:20 ` ChristianEhrhardt
@ 2017-08-22 10:57 ` Dr. David Alan Gilbert
2017-08-22 11:13 ` Dr. David Alan Gilbert
` (25 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 10:57 UTC (permalink / raw)
To: qemu-devel
Hmm i just tried to reproduce this and hit (on the source):
main_channel_client_handle_migrate_connected: client 0x5607d785f610 connected: 0 seamless 0
qemu-system-x86_64: /root/qemu/io/channel.c:303: qio_channel_yield: Assertion `!ioc->write_coroutine' failed.
2017-08-22 10:50:04.888+0000: shutting down, reason=crashed
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (22 preceding siblings ...)
2017-08-22 10:57 ` Dr. David Alan Gilbert
@ 2017-08-22 11:13 ` Dr. David Alan Gilbert
2017-08-22 11:14 ` ChristianEhrhardt
` (24 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 11:13 UTC (permalink / raw)
To: qemu-devel
OK, 3rd try and I've hit the same behaviour as Christian.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (23 preceding siblings ...)
2017-08-22 11:13 ` Dr. David Alan Gilbert
@ 2017-08-22 11:14 ` ChristianEhrhardt
2017-08-22 11:18 ` ChristianEhrhardt
` (23 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 11:14 UTC (permalink / raw)
To: qemu-devel
Stack from qemu_fill_buffer to qio_channel_socket_readv
#0 qio_channel_socket_readv (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, errp=0x0)
at ./io/channel-socket.c:477
#1 0x0000001486ec97e2 in qio_channel_read (ioc=ioc@entry=0x148a73a6c0,
buf=buf@entry=\060\nLw", buflen=buflen@entry=28728, errp=errp@entry=0x0) at ./io/channel.c:112
#2 0x0000001486e005ec in channel_get_buffer (opaque=<optimized out>,
buf=0x1489844c00 "\060\nLw", pos=<optimized out>, size=28728) at ./migration/qemu-file-channel.c:80
#3 0x0000001486dff095 in qemu_fill_buffer (f=f@entry=0x1489843c00) at ./migration/qemu-file.c:293
I checked that sioc->fd, &msg, sflags) is in fact the socket.
With e.g. with this fd being 27
tcp ESTAB 1405050 0 ::ffff:10.22.69.30:49152 ::ffff:10.22.69.157:49804 users:(("qemu-system-x86",pid=29273,fd=27)) ino:3345152 sk:30 <->
skmem:(r1420644,rb1495660,t0,tb332800,f668,w0,o0,bl0,d14) ts sack cubic wscale:7,7 rto:200 rtt:0.04/0.02 ato:80 mss:8948 cwnd:10 bytes_received:1981460 segs_out:37 segs_in:247 data_segs_in:231 send 17896.0Mbps lastsnd:254728 lastrcv:250372 lastack:250372 rcv_rtt:0.205 rcv_space:115461 minrtt:0.04
I need to break on the fail of that recvmsg in qio_channel_socket_readv
# the following does not work due to optimization the ret value is only around later
b io/channel-socket.c:478 if ret < 0
But catching it "inside" the if works
b io/channel-socket.c:479
Take the following with a grain of salt, this is very threaded and noisy to debug.
Once I hit it the recmsg returned "-1", that was on f->pos = 311641887
But at the same time I could confirm (via ss) that the socket itself is still open on source and target of the migration.
-1 is EAGAIN and returns QIO_CHANNEL_ERR_BLOCK
That seems to arrive in nbd_rwv nbd/common.c:44).
And led to "qio_channel_yield"
There are a few corouting switches in between so I hope I'm not loosing anything.
But that first ret<0 actually worked, it seems the yield and retry got it working.
I got back to qemu_fill_buffer iterating further after this.
This hit ret<0 in qio_channel_socket_readv again at f->pos 311641887.
This time on returning the QIO_CHANNEL_ERR_BLOCK it returned to "./migration/qemu-file-channel.c:81".
That was interesting as it is different than before.
After this it seemed to become a death spiral - recmsg returned -1 every time (still on the same offset).
It passed back through the nbd_rwv which called qio_channel_yield for multiple times.
Then it continued and later on on 321998304 is the last I saw.
It did no more pass b io/channel-socket.c:479 at all, but then led to the exit.
Hmm, I might have lost myself on the coroutine switches - but it is odd at least.
Trying to redo less interactive and with a bit more prep ...
Maybe the results are more reliable then ...
Getting back with more later ...
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (24 preceding siblings ...)
2017-08-22 11:14 ` ChristianEhrhardt
@ 2017-08-22 11:18 ` ChristianEhrhardt
2017-08-22 11:23 ` Dr. David Alan Gilbert
` (22 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 11:18 UTC (permalink / raw)
To: qemu-devel
Only now read comment #27, thanks David for reproducing with me, it is
somewhat relieving that you seem to see the same.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (25 preceding siblings ...)
2017-08-22 11:18 ` ChristianEhrhardt
@ 2017-08-22 11:23 ` Dr. David Alan Gilbert
2017-08-22 11:33 ` ChristianEhrhardt
` (21 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 11:23 UTC (permalink / raw)
To: qemu-devel
(4th try) breakpoint on qemu_file_set_error, it's bdrv_inactivate_all
that's returning the error.
(gdb) list
1155 if (inactivate_disks) {
1156 /* Inactivate before sending QEMU_VM_EOF so that the
1157 * bdrv_invalidate_cache_all() on the other end won't fail. */
1158 ret = bdrv_inactivate_all();
1159 if (ret) {
1160 qemu_file_set_error(f, ret);
1161 return ret;
1162 }
1163 }
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (26 preceding siblings ...)
2017-08-22 11:23 ` Dr. David Alan Gilbert
@ 2017-08-22 11:33 ` ChristianEhrhardt
2017-08-22 11:44 ` Dr. David Alan Gilbert
` (20 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 11:33 UTC (permalink / raw)
To: qemu-devel
For me qemu_file_set_error was always called from qemu_fill_buffer, interesting that it seems different for you.
I'll rerun a few times to ensure it really always is always from qemu_fill_buffer for me.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (27 preceding siblings ...)
2017-08-22 11:33 ` ChristianEhrhardt
@ 2017-08-22 11:44 ` Dr. David Alan Gilbert
2017-08-22 11:46 ` Dr. David Alan Gilbert
` (19 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
The difference with the qemu_file_set_error is I'm looking on the source
- because what's happening is the source is erroring so closing the
socket, and so the error you're seeing on the destination is real - the
socket just EOF'd!
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (28 preceding siblings ...)
2017-08-22 11:44 ` Dr. David Alan Gilbert
@ 2017-08-22 11:46 ` Dr. David Alan Gilbert
2017-08-22 11:50 ` ChristianEhrhardt
` (18 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 11:46 UTC (permalink / raw)
To: qemu-devel
repeated the assert in #26:
Program received signal SIGABRT, Aborted.
0x00007f02163005f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) where
#0 0x00007f02163005f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f0216301ce8 in __GI_abort () at abort.c:90
#2 0x00007f02162f9566 in __assert_fail_base (fmt=0x7f0216449288 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x560ac0191e93 "!ioc->write_coroutine", file=file@entry=0x560ac0191e66 "/root/qemu/io/channel.c", line=line@entry=303, function=function@entry=0x560ac0191f60 <__PRETTY_FUNCTION__.22239> "qio_channel_yield")
at assert.c:92
#3 0x00007f02162f9612 in __GI___assert_fail (assertion=assertion@entry=0x560ac0191e93 "!ioc->write_coroutine", file=file@entry=0x560ac0191e66 "/root/qemu/io/channel.c", line=line@entry=303, function=function@entry=0x560ac0191f60 <__PRETTY_FUNCTION__.22239> "qio_channel_yield") at assert.c:101
#4 0x0000560ac0036a08 in qio_channel_yield (ioc=ioc@entry=0x560ac2397c90, condition=condition@entry=G_IO_OUT)
at /root/qemu/io/channel.c:303
#5 0x0000560ac001930e in nbd_rwv (ioc=0x560ac2397c90, iov=<optimized out>, niov=<optimized out>, length=<optimized out>, do_read=do_read@entry=false, errp=errp@entry=0x0) at /root/qemu/nbd/common.c:47
#6 0x0000560ac0007e24 in nbd_co_send_request (bs=bs@entry=0x560ac30167a0, request=request@entry=0x7f0209afc9a0, qiov=qiov@entry=0x560ac2428d68) at /root/qemu/block/nbd-client.c:154
#7 0x0000560ac0008244 in nbd_client_co_pwritev (bs=0x560ac30167a0, offset=3414163456, bytes=<optimized out>, qiov=0x560ac2428d68, flags=<optimized out>) at /root/qemu/block/nbd-client.c:260
#8 0x0000560ac00030e1 in bdrv_driver_pwritev (bs=bs@entry=0x560ac30167a0, offset=offset@entry=3414163456, bytes=bytes@entry=589824, qiov=qiov@entry=0x560ac2428d68, flags=flags@entry=0) at /root/qemu/block/io.c:877
#9 0x0000560ac0004480 in bdrv_aligned_pwritev (req=req@entry=0x7f0209afcba0, offset=offset@entry=3414163456, bytes=589824, align=align@entry=1, qiov=qiov@entry=0x560ac2428d68, flags=flags@entry=0, child=0x560ac1f0a9b0, child=0x560ac1f0a9b0) at /root/qemu/block/io.c:1382
#10 0x0000560ac0005258 in bdrv_co_pwritev (child=0x560ac1f0a9b0, offset=offset@entry=3414163456, bytes=<optimized out>, qiov=qiov@entry=0x560ac2428d68, flags=0) at /root/qemu/block/io.c:1633
#11 0x0000560abffbf564 in raw_co_pwritev (bs=0x560ac22807f0, offset=3414163456, bytes=<optimized out>, qiov=0x560ac2428d68, flags=<optimized out>) at /root/qemu/block/raw-format.c:243
#12 0x0000560ac00030e1 in bdrv_driver_pwritev (bs=bs@entry=0x560ac22807f0, offset=offset@entry=3414163456, bytes=bytes@entry=589824, qiov=qiov@entry=0x560ac2428d68, flags=flags@entry=0) at /root/qemu/block/io.c:877
#13 0x0000560ac0004480 in bdrv_aligned_pwritev (req=req@entry=0x7f0209afce70, offset=offset@entry=3414163456, bytes=589824, align=align@entry=1, qiov=qiov@entry=0x560ac2428d68, flags=flags@entry=0, child=0x560ac33c1e70, child=0x560ac33c1e70) at /root/qemu/block/io.c:1382
#14 0x0000560ac0005258 in bdrv_co_pwritev (child=0x560ac33c1e70, offset=offset@entry=3414163456, bytes=<optimized out>, bytes@entry=589824, qiov=qiov@entry=0x560ac2428d68, flags=0) at /root/qemu/block/io.c:1633
#15 0x0000560abfff5173 in blk_co_pwritev (blk=0x560ac14f31e0, offset=3414163456, bytes=589824, qiov=0x560ac2428d68, flags=<optimized out>) at /root/qemu/block/block-backend.c:1062
#16 0x0000560abfff528a in blk_aio_write_entry (opaque=0x560ac2c18f70) at /root/qemu/block/block-backend.c:1253
#17 0x0000560ac0092cca in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>)
at /root/qemu/util/coroutine-ucontext.c:79
#18 0x00007f0216312110 in __start_context () at /lib64/libc.so.6
#19 0x00007fff12c4db30 in ()
#20 0x0000000000000000 in ()
(gdb)
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (29 preceding siblings ...)
2017-08-22 11:46 ` Dr. David Alan Gilbert
@ 2017-08-22 11:50 ` ChristianEhrhardt
2017-08-22 15:05 ` Dr. David Alan Gilbert
` (17 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 11:50 UTC (permalink / raw)
To: qemu-devel
In 5/5 tries this was on qemu_fill_buffer for my case.
But that was on the receiving side, and what you found is closer to the root cause on the source of the migration.
I checked on qemu_file_set_error on the source and can confirm your finding that on the source it is from bdrv_inactivate_all.
#0 qemu_file_set_error (f=f@entry=0x6b76b46c00, ret=ret@entry=-1) at ./migration/qemu-file.c:124
#1 0x0000006b727140cb in qemu_savevm_state_complete_precopy (f=0x6b76b46c00, iterable_only=iterable_only@entry=false,
inactivate_disks=inactivate_disks@entry=true) at ./migration/savevm.c:1160
#2 0x0000006b7270c84b in migration_completion (start_time=<synthetic pointer>, old_vm_running=<synthetic pointer>, current_active_state=4,
s=0x6b74ef53b0) at ./migration/migration.c:1858
#3 migration_thread (opaque=0x6b74ef53b0) at ./migration/migration.c:2023
#4 0x00007f61a740e74a in start_thread (arg=0x7f61467fc700) at pthread_create.c:456
#5 0x00007f61a714acaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
Also as I outlined - what seems ages ago in comment #6 - if the source
is a qemu 2.8 the migration works for me which would kind of match
assuming the root cause is in the source.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (30 preceding siblings ...)
2017-08-22 11:50 ` ChristianEhrhardt
@ 2017-08-22 15:05 ` Dr. David Alan Gilbert
2017-08-22 15:47 ` ChristianEhrhardt
` (16 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-22 15:05 UTC (permalink / raw)
To: qemu-devel
OK, Stefan posted a patch for that assert (see 'nbd-client: avoid spurious qui_channel_yield() re-entry) so now I'm running with the following patch and I'm seeing the bdrv_inactivate return a -1 for
drive-virtio-disk0
Christian: Could you see what your source says with this patch?
diff --git a/block.c b/block.c
index 3615a68..f9bd689 100644
--- a/block.c
+++ b/block.c
@@ -4078,9 +4078,11 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
BdrvChild *child, *parent;
int ret;
+ fprintf(stderr, "%s: entry for %s\n", __func__, bdrv_get_device_or_node_name(bs));
if (!setting_flag && bs->drv->bdrv_inactivate) {
ret = bs->drv->bdrv_inactivate(bs);
if (ret < 0) {
+ fprintf(stderr, "%s: exit 1(%d) for %s\n", __func__, ret, bdrv_get_device_or_node_name(bs));
return ret;
}
}
@@ -4094,6 +4096,7 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
if (parent->role->inactivate) {
ret = parent->role->inactivate(parent);
if (ret < 0) {
+ fprintf(stderr, "%s: exit 2(%d) for %s\n", __func__, ret, bdrv_get_device_or_node_name(bs));
bs->open_flags &= ~BDRV_O_INACTIVE;
return ret;
}
@@ -4109,6 +4112,7 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
QLIST_FOREACH(child, &bs->children, next) {
ret = bdrv_inactivate_recurse(child->bs, setting_flag);
if (ret < 0) {
+ fprintf(stderr, "%s: exit 3(%d) for %s\n", __func__, ret, bdrv_get_device_or_node_name(bs));
return ret;
}
}
@@ -4117,6 +4121,7 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
* driver */
bdrv_release_persistent_dirty_bitmaps(bs);
+ fprintf(stderr, "%s: exit end good for %s\n", __func__, bdrv_get_device_or_node_name(bs));
return 0;
}
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (31 preceding siblings ...)
2017-08-22 15:05 ` Dr. David Alan Gilbert
@ 2017-08-22 15:47 ` ChristianEhrhardt
2017-08-22 17:57 ` ChristianEhrhardt
` (15 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 15:47 UTC (permalink / raw)
To: qemu-devel
Building with the attached debug patch ...
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (32 preceding siblings ...)
2017-08-22 15:47 ` ChristianEhrhardt
@ 2017-08-22 17:57 ` ChristianEhrhardt
2017-08-22 19:21 ` ChristianEhrhardt
` (14 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 17:57 UTC (permalink / raw)
To: qemu-devel
I didn't add Stefans patch yet.
Note: the Mentioned patch is at: Note: http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg04027.html
With your debug patch applied I get:
2017-08-22 17:57:04.486+0000: initiating migration
bdrv_inactivate_recurse: entry for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block145
bdrv_inactivate_recurse: entry for #block328
bdrv_inactivate_recurse: entry for #block210
bdrv_inactivate_recurse: exit end good for #block210
bdrv_inactivate_recurse: exit end good for #block328
bdrv_inactivate_recurse: entry for #block082
bdrv_inactivate_recurse: exit end good for #block082
bdrv_inactivate_recurse: exit end good for #block145
bdrv_inactivate_recurse: exit end good for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block763
bdrv_inactivate_recurse: entry for #block631
bdrv_inactivate_recurse: exit end good for #block631
bdrv_inactivate_recurse: exit end good for #block763
bdrv_inactivate_recurse: entry for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block544
bdrv_inactivate_recurse: entry for #block405
bdrv_inactivate_recurse: exit end good for #block405
bdrv_inactivate_recurse: exit end good for #block544
bdrv_inactivate_recurse: exit end good for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block1086
bdrv_inactivate_recurse: entry for #block919
bdrv_inactivate_recurse: exit end good for #block919
bdrv_inactivate_recurse: exit end good for #block1086
bdrv_inactivate_recurse: entry for drive-virtio-disk0
bdrv_inactivate_recurse: exit 2(-1) for drive-virtio-disk0
I'm currently building one with Stefans patch applied as well over (my)
night, but let me know if there is more that makes sense to try.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (33 preceding siblings ...)
2017-08-22 17:57 ` ChristianEhrhardt
@ 2017-08-22 19:21 ` ChristianEhrhardt
2017-08-23 8:02 ` Dr. David Alan Gilbert
` (13 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-22 19:21 UTC (permalink / raw)
To: qemu-devel
With the patch from Stefan and your debug applied source and target I still run into the same issue I'd say.
Id's are slightly off, but they are different on every try anyway.
Still looks the same for me:
bdrv_inactivate_recurse: entry for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block184
bdrv_inactivate_recurse: entry for #block319
bdrv_inactivate_recurse: entry for #block218
bdrv_inactivate_recurse: exit end good for #block218
bdrv_inactivate_recurse: exit end good for #block319
bdrv_inactivate_recurse: entry for #block092
bdrv_inactivate_recurse: exit end good for #block092
bdrv_inactivate_recurse: exit end good for #block184
bdrv_inactivate_recurse: exit end good for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block1905
bdrv_inactivate_recurse: entry for #block1889
bdrv_inactivate_recurse: exit end good for #block1889
bdrv_inactivate_recurse: exit end good for #block1905
bdrv_inactivate_recurse: entry for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block551
bdrv_inactivate_recurse: entry for #block423
bdrv_inactivate_recurse: exit end good for #block423
bdrv_inactivate_recurse: exit end good for #block551
bdrv_inactivate_recurse: exit end good for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block2246
bdrv_inactivate_recurse: entry for #block2106
bdrv_inactivate_recurse: exit end good for #block2106
bdrv_inactivate_recurse: exit end good for #block2246
bdrv_inactivate_recurse: entry for drive-virtio-disk0
bdrv_inactivate_recurse: exit 2(-1) for drive-virtio-disk0
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (34 preceding siblings ...)
2017-08-22 19:21 ` ChristianEhrhardt
@ 2017-08-23 8:02 ` Dr. David Alan Gilbert
2017-08-23 14:14 ` Stefan Hajnoczi
` (12 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-23 8:02 UTC (permalink / raw)
To: qemu-devel
OK, yeh that's the same symptom I saw - it's that final failure that
causes bdrv_inactivate_all to return a failure and fail the source
migration.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (35 preceding siblings ...)
2017-08-23 8:02 ` Dr. David Alan Gilbert
@ 2017-08-23 14:14 ` Stefan Hajnoczi
2017-08-23 14:41 ` Dr. David Alan Gilbert
` (11 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Stefan Hajnoczi @ 2017-08-23 14:14 UTC (permalink / raw)
To: qemu-devel
Please see Fam's patch series "[PATCH for-2.10 0/4] block: Fix non-
shared storage migration" that fixes this issue.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (36 preceding siblings ...)
2017-08-23 14:14 ` Stefan Hajnoczi
@ 2017-08-23 14:41 ` Dr. David Alan Gilbert
2017-08-23 14:46 ` ChristianEhrhardt
` (10 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-23 14:41 UTC (permalink / raw)
To: qemu-devel
yes, seems to fix it for me.
Thanks Christian for filing this; we probably wouldn't have spotted it before the release without it
(which the test Stefan has just added will hopefully cure!).
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (37 preceding siblings ...)
2017-08-23 14:41 ` Dr. David Alan Gilbert
@ 2017-08-23 14:46 ` ChristianEhrhardt
2017-08-23 14:55 ` ChristianEhrhardt
` (9 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 14:46 UTC (permalink / raw)
To: qemu-devel
Hi Stefan,
I was part of the report around the series in "[PATCH for-2.10 0/4] block: Fix non-shared storage migration", but this is happening on rc3 which contains this.
AFAIK Fam's series is:
dd7fdaad iotests: Add non-shared storage migration case 192 (Fam)
5f7772c4 block-backend: Defer shared_perm tightening migration completion (Fam)
3dff24f2 nbd: Fix order of bdrv_set_perm and bdrv_invalidate_cache (Kevin)
80adf54e stubs: Add vm state change handler stubs (Fam)
All these got into v2.10.0-rc3 which these tests are based on already.
IMHO - This is not complete for qemu 2.10 and a regression since 2.9 (well since 2.8 as I haven't tested 2.9 personally).
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (38 preceding siblings ...)
2017-08-23 14:46 ` ChristianEhrhardt
@ 2017-08-23 14:55 ` ChristianEhrhardt
2017-08-23 15:13 ` Eric Blake
2017-08-23 16:56 ` ChristianEhrhardt
` (8 subsequent siblings)
48 siblings, 1 reply; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 14:55 UTC (permalink / raw)
To: qemu-devel
Ok, clarified with Stefanha
It has exactly the same title as a series of 18th August which was related to a similar issue.
It is about an hour old now on qemu-devel, quoting
"This fixes the issue reported as
https://bugs.launchpad.net/bugs/1711602
Fam Zheng (3):
block-backend: Refactor inactivate check
block-backend: Allow more "can inactivate" cases
mirror: Mark target BB as "force allow inactivate"
Stefan Hajnoczi (1):
block: Update open_flags after ->inactivate() callback"
I'll prep a build with that and test as well
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
2017-08-23 14:55 ` ChristianEhrhardt
@ 2017-08-23 15:13 ` Eric Blake
0 siblings, 0 replies; 50+ messages in thread
From: Eric Blake @ 2017-08-23 15:13 UTC (permalink / raw)
To: Bug 1711602, qemu-devel
[-- Attachment #1: Type: text/plain, Size: 993 bytes --]
On 08/23/2017 09:55 AM, ChristianEhrhardt wrote:
> Ok, clarified with Stefanha
> It has exactly the same title as a series of 18th August which was related to a similar issue.
> It is about an hour old now on qemu-devel, quoting
>
> "This fixes the issue reported as
> https://bugs.launchpad.net/bugs/1711602
>
> Fam Zheng (3):
> block-backend: Refactor inactivate check
> block-backend: Allow more "can inactivate" cases
> mirror: Mark target BB as "force allow inactivate"
>
> Stefan Hajnoczi (1):
> block: Update open_flags after ->inactivate() callback"
>
>
> I'll prep a build with that and test as well
Here's what is brewing for my pull request, although if you can
successfully test things, I'm happy to add a Tested-by: tag before
actually sending the pull request:
git fetch git://repo.or.cz/qemu/ericb.git nbd
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (39 preceding siblings ...)
2017-08-23 14:55 ` ChristianEhrhardt
@ 2017-08-23 16:56 ` ChristianEhrhardt
2017-08-23 17:01 ` ChristianEhrhardt
` (7 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 16:56 UTC (permalink / raw)
To: qemu-devel
Hmm,
it gets further but can still not complete this kind of migration:
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal
qemu+ssh://10.22.69.30/system
Source:
2017-08-23 16:49:23.022+0000: initiating migration
Unexpected error in bdrv_check_perm() at /build/qemu-VjSgVJ/qemu-2.10~rc3+dfsg/block.c:1574:
2017-08-23T16:49:23.203181Z qemu-system-x86_64: Block node is read-only
2017-08-23 16:49:23.762+0000: shutting down, reason=crashed
Target:
2017-08-23T16:49:23.495478Z qemu-system-x86_64: Failed to load virtio_pci/modern_state:modern_state
2017-08-23T16:49:23.495505Z qemu-system-x86_64: Failed to load virtio/extra_state:extra_state
2017-08-23T16:49:23.495510Z qemu-system-x86_64: Failed to load virtio-balloon:virtio
2017-08-23T16:49:23.495515Z qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:06.0/virtio-balloon'
2017-08-23T16:49:23.496071Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-23 16:49:23.797+0000: shutting down, reason=crashed
I was to eager to get this close-to-real so I don't have Davids fprintf's applied anymore - I'll build those and then run it in the debugger, but until then what I can see is that behavior slightly changes (worse).
It now crashes the guest on the source as well when aborting the migration.
I need to debug to confirm, but it seems it still aborts the migration
-> qemu-system-x86_64: load of migration failed: Input/output error
But then can't fall back to the source and crashes at
-> qemu-system-x86_64: Block node is read-only
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (40 preceding siblings ...)
2017-08-23 16:56 ` ChristianEhrhardt
@ 2017-08-23 17:01 ` ChristianEhrhardt
2017-08-23 18:10 ` Dr. David Alan Gilbert
` (6 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 17:01 UTC (permalink / raw)
To: qemu-devel
That was rc3 +:
- nbd-client-avoid-spurious-qio_channel_yield.patch
- the four patches mentioned in comment #43
I could also re-base onto master + pacthes or rc4 if there is one soon.
For now building with Davids debug statements applied again to check if we still abort around that assert.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (41 preceding siblings ...)
2017-08-23 17:01 ` ChristianEhrhardt
@ 2017-08-23 18:10 ` Dr. David Alan Gilbert
2017-08-23 18:55 ` Dr. David Alan Gilbert
` (5 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-23 18:10 UTC (permalink / raw)
To: qemu-devel
I need to recheck with that combo - I'd seen that error but only when
I'd commented out 'if (!blk->dev && !blk_name(blk)[0]) {' when
debugging earlier.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (42 preceding siblings ...)
2017-08-23 18:10 ` Dr. David Alan Gilbert
@ 2017-08-23 18:55 ` Dr. David Alan Gilbert
2017-08-23 19:06 ` Dr. David Alan Gilbert
` (4 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-23 18:55 UTC (permalink / raw)
To: qemu-devel
Looks good here, just retested:
here's teh top of my git:
f89f59fad5119f878aaedf711af90802ddcb99c7 nbd-client: avoid spurious qio_channel_yield() re-entry
cf26039a2b50f078b4ad90b88eea5bb28971c0d8 block: Update open_flags after ->inactivate() callback
8ccc527d84ec9a5052cfae19edbc44abb5ac03ae mirror: Mark target BB as "force allow inactivate"
34c3f17c99a43f261560edbd3da1188dd0c398ab block-backend: Allow more "can inactivate" cases
952ad9fd9dd43e92016d5bfc0ff93bdeaec13bf9 block-backend: Refactor inactivate check
1f296733876434118fd766cfef5eb6f29ecab6a8 Update version for v2.10.0-rc3 release
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (43 preceding siblings ...)
2017-08-23 18:55 ` Dr. David Alan Gilbert
@ 2017-08-23 19:06 ` Dr. David Alan Gilbert
2017-08-23 19:10 ` ChristianEhrhardt
` (3 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: Dr. David Alan Gilbert @ 2017-08-23 19:06 UTC (permalink / raw)
To: qemu-devel
just tested current head - 1eed33994e28d4a0437ba6e944bbc3ec5e4a29a0 -
seems to work for me.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (44 preceding siblings ...)
2017-08-23 19:06 ` Dr. David Alan Gilbert
@ 2017-08-23 19:10 ` ChristianEhrhardt
2017-08-23 21:02 ` ChristianEhrhardt
` (2 subsequent siblings)
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 19:10 UTC (permalink / raw)
To: qemu-devel
Yeah seems to be slightly different than the former assert.
2017-08-23 18:41:54.556+0000: initiating migration
bdrv_inactivate_recurse: entry for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block133
bdrv_inactivate_recurse: entry for #block329
bdrv_inactivate_recurse: entry for #block202
bdrv_inactivate_recurse: exit end good for #block202
bdrv_inactivate_recurse: exit end good for #block329
bdrv_inactivate_recurse: entry for #block025
bdrv_inactivate_recurse: exit end good for #block025
bdrv_inactivate_recurse: exit end good for #block133
bdrv_inactivate_recurse: exit end good for drive-virtio-disk0
bdrv_inactivate_recurse: entry for #block799
bdrv_inactivate_recurse: entry for #block626
bdrv_inactivate_recurse: exit end good for #block626
bdrv_inactivate_recurse: exit end good for #block799
bdrv_inactivate_recurse: entry for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block570
bdrv_inactivate_recurse: entry for #block485
bdrv_inactivate_recurse: exit end good for #block485
bdrv_inactivate_recurse: exit end good for #block570
bdrv_inactivate_recurse: exit end good for drive-virtio-disk1
bdrv_inactivate_recurse: entry for #block1058
bdrv_inactivate_recurse: entry for #block920
bdrv_inactivate_recurse: exit end good for #block920
bdrv_inactivate_recurse: exit end good for #block1058
bdrv_inactivate_recurse: entry for drive-virtio-disk0
Unexpected error in bdrv_check_perm() at /build/qemu-0OVYHF/qemu-2.10~rc3+dfsg/block.c:1574:
2017-08-23T18:41:54.730131Z qemu-system-x86_64: Block node is read-only
Which is:
1553 /*
1554 * Check whether permissions on this node can be changed in a way that
1555 * @cumulative_perms and @cumulative_shared_perms are the new cumulative
1556 * permissions of all its parents. This involves checking whether all necessary
1557 * permission changes to child nodes can be performed.
1558 *
1559 * A call to this function must always be followed by a call to bdrv_set_perm()
1560 * or bdrv_abort_perm_update().
1561 */
1562 static int bdrv_check_perm(BlockDriverState *bs, uint64_t cumulative_perms,
1563 uint64_t cumulative_shared_perms,
1564 GSList *ignore_children, Error **errp)
1565 {
1566 BlockDriver *drv = bs->drv;
1567 BdrvChild *c;
1568 int ret;
1569
1570 /* Write permissions never work with read-only images */
1571 if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
1572 !bdrv_is_writable(bs))
1573 {
1574 error_setg(errp, "Block node is read-only");
1575 return -EPERM;
1576 }
Adding in debug symbols to see in gdb which device that actually is showed me:
I don't know what you might need so the full struct:
(gdb) p *bs
$2 = {open_flags = 2050, read_only = false, encrypted = false, sg = false, probed = false, force_share = false, implicit = true,
drv = 0x1a67219800 <bdrv_mirror_top>, opaque = 0x0, aio_context = 0x1a684ae0d0, aio_notifiers = {lh_first = 0x1a6a4850e0},
walking_aio_notifiers = false, filename = "/var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow", '\000' <repeats 4037 times>,
backing_file = "/var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow", '\000' <repeats 4037 times>,
backing_format = "qcow2\000\000\000\000\000\000\000\000\000\000", full_open_options = 0x0,
exact_filename = "/var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow", '\000' <repeats 4037 times>, backing = 0x1a6971a4a0,
file = 0x0, bl = {request_alignment = 1, max_pdiscard = 0, pdiscard_alignment = 0, max_pwrite_zeroes = 0, pwrite_zeroes_alignment = 0,
opt_transfer = 0, max_transfer = 0, min_mem_alignment = 512, opt_mem_alignment = 4096, max_iov = 1024}, supported_write_flags = 0,
supported_zero_flags = 0, node_name = "#block814", '\000' <repeats 22 times>, node_list = {tqe_next = 0x1a684b44d0, tqe_prev = 0x1a6b02e0c0},
bs_list = {tqe_next = 0x1a6a010030, tqe_prev = 0x1a6ab6bc50}, monitor_list = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 3, op_blockers = {{
lh_first = 0x1a69e18e80}, {lh_first = 0x1a69e18ea0}, {lh_first = 0x1a69e18ec0}, {lh_first = 0x1a69e18ee0}, {lh_first = 0x1a69e18f00}, {
lh_first = 0x0}, {lh_first = 0x1a69e18f40}, {lh_first = 0x1a69e18f60}, {lh_first = 0x1a69e18f80}, {lh_first = 0x1a69e18fa0}, {
lh_first = 0x1a6989be30}, {lh_first = 0x1a69e18fc0}, {lh_first = 0x1a69e18fe0}, {lh_first = 0x1a69352e90}, {lh_first = 0x1a69352eb0}, {
lh_first = 0x1a69352ed0}}, job = 0x1a69e18bf0, inherits_from = 0x0, children = {lh_first = 0x1a6971a4a0}, parents = {
lh_first = 0x1a69e18e00}, options = 0x1a69b636a0, explicit_options = 0x1a69e16bb0, detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF,
backing_blocker = 0x1a686e2e00, total_sectors = 16777216, before_write_notifiers = {notifiers = {lh_first = 0x0}}, write_threshold_offset = 0,
write_threshold_notifier = {notify = 0x0, node = {le_next = 0x0, le_prev = 0x0}}, dirty_bitmap_mutex = {lock = {__data = {__lock = 0,
__count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 39 times>, __align = 0}, initialized = true}, dirty_bitmaps = {lh_first = 0x0}, wr_highest_offset = {
value = 1190584320}, copy_on_read = 0, in_flight = 0, serialising_in_flight = 0, wakeup = false, io_plugged = 0, enable_write_cache = 0,
quiesce_counter = 0, write_gen = 2, reqs_lock = {locked = 0, ctx = 0x0, from_push = {slh_first = 0x0}, to_pop = {slh_first = 0x0},
handoff = 0, sequence = 0, holder = 0x0}, tracked_requests = {lh_first = 0x0}, flush_queue = {entries = {sqh_first = 0x0,
sqh_last = 0x1a69b63680}}, active_flush_req = false, flushed_gen = 2}
And that effectively is my root disk:
At least the trivial flag in the struct is "read_only = false".
Also on a FS level it is rw:
-rw------- 1 root root 717160448 Aug 23 18:50 /var/lib/uvtool/libvirt/images/kvmguest-artful-normal.qcow
(qemu is running privileged in this setup with UID 0, so no reason to mark that as read only IMHO)
So I checked the full context of the if that leads to the error:
(cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED))
3 (in my case) & ( 0x2 | 0x4)
ok that is a match
So it goes further to
!bdrv_is_writable(bs)
Which effectively is:
!bdrv_is_read_only(bs) && !(bs->open_flags & BDRV_O_INACTIVE);
!bs->read_only ! (2050 & 0x800)
!false !(true)
true false
So the problem is that BDRV_O_INACTIVE is set?
Sorry I don't see why that is so (maybe too late for today).
But I hope that helps in understanding the remaining case.
I checked against your coommit list and I didn't have the following yet.
cf26039a2b50f078b4ad90b88eea5bb28971c0d8 block: Update open_flags after ->inactivate() callback
I took it now from the PULL 0/6 of Eric that appeared after my last test.
Building with that now to report once again.
If there is no build hickup that next test should just fit in before I fall asleep.
Hoping for the best to report a tested by in time if possible.
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
New
Status in libvirt package in Ubuntu:
Confirmed
Status in qemu package in Ubuntu:
Confirmed
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (45 preceding siblings ...)
2017-08-23 19:10 ` ChristianEhrhardt
@ 2017-08-23 21:02 ` ChristianEhrhardt
2017-09-02 8:04 ` Launchpad Bug Tracker
2017-12-15 16:10 ` Thomas Huth
48 siblings, 0 replies; 50+ messages in thread
From: ChristianEhrhardt @ 2017-08-23 21:02 UTC (permalink / raw)
To: qemu-devel
Yes, with all the series of [1] on top it finally works.
Saw it already being merged on master.
Expecting a late rc4 or early release tag and then wrap all it up.
Thanks everybody involved!
[1]: http://lists.nongnu.org/archive/html/qemu-
devel/2017-08/msg04513.html
** Changed in: qemu
Status: New => Fix Committed
** No longer affects: libvirt (Ubuntu)
** Changed in: qemu (Ubuntu)
Status: Confirmed => In Progress
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
Fix Committed
Status in qemu package in Ubuntu:
In Progress
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (46 preceding siblings ...)
2017-08-23 21:02 ` ChristianEhrhardt
@ 2017-09-02 8:04 ` Launchpad Bug Tracker
2017-12-15 16:10 ` Thomas Huth
48 siblings, 0 replies; 50+ messages in thread
From: Launchpad Bug Tracker @ 2017-09-02 8:04 UTC (permalink / raw)
To: qemu-devel
This bug was fixed in the package qemu - 1:2.10~rc4+dfsg-0ubuntu1
---------------
qemu (1:2.10~rc4+dfsg-0ubuntu1) artful; urgency=medium
* Merge with Upstream 2.10-rc4; This fixes a migration issue (LP: #1711602);
Remaining changes:
- qemu-kvm to systemd unit
- d/qemu-kvm-init: script for QEMU KVM preparation modules, ksm,
hugepages and architecture specifics
- d/qemu-kvm.service: systemd unit to call qemu-kvm-init
- d/qemu-system-common.install: install systemd unit and helper script
- d/qemu-system-common.maintscript: clean old sysv and upstart scripts
- d/qemu-system-common.qemu-kvm.default: defaults for
/etc/default/qemu-kvm
- d/rules: install /etc/default/qemu-kvm
- Enable nesting by default
- set nested=1 module option on intel. (is default on amd)
- re-load kvm_intel.ko if it was loaded without nested=1
- d/p/ubuntu/expose-vmx_qemu64cpu.patch: expose nested kvm by default
in qemu64 cpu type.
- d/p/ubuntu/enable-svm-by-default.patch: Enable nested svm by default
in qemu64 on amd
- libvirt/qemu user/group support
- qemu-system-common.postinst: remove acl placed by udev, and add udevadm
trigger.
- qemu-system-common.preinst: add kvm group if needed
- Distribution specific machine type
- d/p/ubuntu/define-ubuntu-machine-types.patch: define distro machine
types to ease future live vm migration.
- d/qemu-system-x86.NEWS Info on fixed machine type defintions
- improved dependencies
- Make qemu-system-common depend on qemu-block-extra
- Make qemu-utils depend on qemu-block-extra
- let qemu-utils recommend sharutils
- s390x support
- Create qemu-system-s390x package
- Include s390-ccw.img firmware
- Enable numa support for s390x
- ppc64[le] support
- d/qemu-system-ppc.links provide usr/bin/qemu-system-ppc64le symlink
- Enable seccomp for ppc64el
- bump libseccomp-dev dependency, 2.3 is the minimum for ppc64
- arch aware kvm wrappers
- update VCS-git to match the Artful branch
- disable missing x32 architecture
- d/rules: or32 is now named or1k (since 4a09d0bb)
- d/qemu-system-common.docs: new paths since (ac06724a)
- d/qemu-system-common.install: qmp-commands.txt removed, but replaced
by qapi-schema.json which is already packaged (since 4d8bb958)
- d/p/02_kfreebsd.patch: utimensat is no more optional upstream (Update
to Debian patch to match qemu 2.10)
- s390x package now builds correctly on all architectures (LP 1710695)
* Added changes:
- d/qemu-system-common.docs: adapt new path of live-block-operations.rst
since 8508eee7
- d/qemu-system-common.docs: adapt q35 config paths since 9ca019c1
- make nios2/hppa not installed explicitly until further stablized
- d/qemu-guest-agent.install: add the new guest agent reference man page
qemu-ga-ref
- d/qemu-system-common.install: add the now generated qapi/qmp reference
along the qapi intro
- d/not-installed: ignore further generated (since 56e8bdd4) files in
dh_missing that are already provided in other formats qemu-doc,
qemu-qmp-ref,qemu-ga-ref
- d/p/ubuntu/define-ubuntu-machine-types.patch: update to match new
changes in 2.10-rc4
-- Christian Ehrhardt <christian.ehrhardt@canonical.com> Fri, 25 Aug
2017 07:49:30 +0200
** Changed in: qemu (Ubuntu)
Status: In Progress => Fix Released
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
Fix Committed
Status in qemu package in Ubuntu:
Fix Released
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread
* [Qemu-devel] [Bug 1711602] Re: --copy-storage-all failing with qemu 2.10
[not found] <150305905460.11582.12289718300820278863.malonedeb@wampee.canonical.com>
` (47 preceding siblings ...)
2017-09-02 8:04 ` Launchpad Bug Tracker
@ 2017-12-15 16:10 ` Thomas Huth
48 siblings, 0 replies; 50+ messages in thread
From: Thomas Huth @ 2017-12-15 16:10 UTC (permalink / raw)
To: qemu-devel
** Changed in: qemu
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1711602
Title:
--copy-storage-all failing with qemu 2.10
Status in QEMU:
Fix Released
Status in qemu package in Ubuntu:
Fix Released
Bug description:
We fixed an issue around disk locking already in regard to qemu-nbd
[1], but there still seem to be issues.
$ virsh migrate --live --copy-storage-all kvmguest-artful-normal qemu+ssh://10.22.69.196/system
error: internal error: qemu unexpectedly closed the monitor: 2017-08-18T12:10:29.800397Z qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)
2017-08-18T12:10:48.545776Z qemu-system-x86_64: load of migration failed: Input/output error
Source libvirt log for the guest:
2017-08-18 12:09:08.251+0000: initiating migration
2017-08-18T12:09:08.809023Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
2017-08-18T12:09:08.809481Z qemu-system-x86_64: Unable to read from socket: Connection reset by peer
Target libvirt log for the guest:
2017-08-18T12:09:08.730911Z qemu-system-x86_64: load of migration failed: Input/output error
2017-08-18 12:09:09.010+0000: shutting down, reason=crashed
Given the timing it seems that the actual copy now works (it is busy ~10 seconds on my environment which would be the copy).
Also we don't see the old errors we saw before, but afterwards on the actual take-over it fails.
Dmesg has no related denials as often apparmor is in the mix.
Need to check libvirt logs of source [2] and target [3] in Detail.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02200.html
[2]: http://paste.ubuntu.com/25339356/
[3]: http://paste.ubuntu.com/25339358/
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1711602/+subscriptions
^ permalink raw reply [flat|nested] 50+ messages in thread