From: Xiao Guangrong <guangrong.xiao@gmail.com>
To: Peter Xu <peterx@redhat.com>
Cc: kvm@vger.kernel.org, mst@redhat.com, mtosatti@redhat.com,
	Xiao Guangrong <xiaoguangrong@tencent.com>,
	qemu-devel@nongnu.org, pbonzini@redhat.com
Subject: Re: [PATCH 3/8] migration: support to detect compression and decompression errors
Date: Thu, 22 Mar 2018 20:03:53 +0800
Message-ID: <48da6d2b-f5f4-a077-f282-ed52392c17f9@gmail.com>
In-Reply-To: <20180321100043.GA30634@xz-mi>



On 03/21/2018 06:00 PM, Peter Xu wrote:
> On Tue, Mar 13, 2018 at 03:57:34PM +0800, guangrong.xiao@gmail.com wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> Currently the page being compressed is allowed to be updated by
>> the VM on the source QEMU, correspondingly the destination QEMU
>> just ignores the decompression error. However, we completely miss
>> the chance to catch real errors, then the VM is corrupted silently
>>
>> To make the migration more robuster, we copy the page to a buffer
>> first to avoid it being written by VM, then detect and handle the
>> errors of both compression and decompression errors properly
> 
> Not sure I missed anything important, but I'll just shoot my thoughts
> as questions (again)...
> 
> Actually this is a more general question? Say, even without
> compression, we can be sending a page that is being modified.
> 
> However, IMHO we don't need to worry that, since if that page is
> modified, we'll definitely send that page again, so the new page will
> replace the old.  So on destination side, even if decompress() failed
> on a page it'll be fine IMHO.  Though now we are copying the corrupted
> buffer.  On that point, I fully agree that we should not - maybe we
> can just drop the page entirely?
> 
> For non-compress pages, we can't detect that, so we'll copy the page
> even if corrupted.
> 
> The special part for compression would be: would the deflate() fail if
> there is concurrent update to the buffer being compressed?  And would
> that corrupt the whole compression stream, or it would only fail the
> deflate() call?

Normal pages and compressed pages are not the same case.

For a normal page, the dirty-log mechanism in QEMU and the network
infrastructure (e.g., TCP) make sure that the modified memory is
posted to the destination without corruption.

However, nothing can guarantee that compression/decompression is bug-free.
E.g., consider the last step of migration: the vCPUs and the dirty log are
paused, and the memory is compressed and posted to the destination. Nothing
is resent after that point, so if there is any error in
compression/decompression, the VM dies silently.
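
To illustrate the approach from the patch description, here is a minimal
sketch in Python (not the QEMU C code from the patch). It snapshots the page
into a private buffer before compressing it, and it treats any decompression
failure on the destination as a fatal error instead of ignoring it. The names
PAGE_SIZE, MigrationError, compress_page() and decompress_page() are
hypothetical and exist only for this example; the 4096-byte page size is an
assumption.

```python
import zlib

PAGE_SIZE = 4096  # assumed guest page size, for illustration only


class MigrationError(Exception):
    """Hypothetical error type: the migration must be failed, not continued."""


def compress_page(guest_page: bytearray) -> bytes:
    """Source side: copy the page first so the guest cannot modify the
    data while it is being compressed, then compress the private copy."""
    snapshot = bytes(guest_page)  # private, immutable copy of the page
    if len(snapshot) != PAGE_SIZE:
        raise MigrationError("unexpected page size %d" % len(snapshot))
    try:
        return zlib.compress(snapshot)
    except zlib.error as exc:
        # The input is a stable copy, so a failure here is a real error.
        raise MigrationError("compression failed") from exc


def decompress_page(payload: bytes) -> bytes:
    """Destination side: report decompression errors instead of ignoring them."""
    try:
        page = zlib.decompress(payload)
    except zlib.error as exc:
        raise MigrationError("decompression failed") from exc
    if len(page) != PAGE_SIZE:
        raise MigrationError("decompressed %d bytes, expected %d"
                             % (len(page), PAGE_SIZE))
    return page
```

Because the source compresses a stable copy, a concurrent guest write can no
longer be the cause of a failure, so the destination can afford to treat every
decompression error as real.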

