From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 20 Mar 2018 04:59:13 +0200
From: "Michael S. Tsirkin"
To: Wei Wang
Cc: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, quintela@redhat.com,
 dgilbert@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com,
 yang.zhang.wz@gmail.com, quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com
Subject: Re: [Qemu-devel] [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
Message-ID: <20180320045134-mutt-send-email-mst@kernel.org>
In-Reply-To: <5AB06EE9.1030306@intel.com>
References: <1521197309-13544-1-git-send-email-wei.w.wang@intel.com>
 <1521197309-13544-5-git-send-email-wei.w.wang@intel.com>
 <20180316165319-mutt-send-email-mst@kernel.org>
 <5AAE4124.5010602@intel.com>
 <20180319062023-mutt-send-email-mst@kernel.org>
 <5AAF7C72.5070403@intel.com>
 <20180320004900-mutt-send-email-mst@kernel.org>
 <5AB06EE9.1030306@intel.com>

On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote:
> On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote:
> > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote:
> > > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote:
> > > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote:
> > > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote:
> > > > OTOH it seems that if the thread stops, nothing will wake it up
> > > > when the VM is restarted. Such a behaviour change across
> > > > vmstop/vmstart is unexpected. I do not understand why we want to
> > > > increment the counter on vm stop, though. It does make sense to
> > > > stop the thread, but why not resume where we left off when the VM
> > > > is resumed?
> > >
> > > I'm not sure which counter we incremented. But it becomes clear with
> > > a high-level view of how it works (it is symmetric, actually).
> > > Basically, we start the optimization when each round starts and stop
> > > it at the end of each round (i.e. before we do the bitmap sync), as
> > > shown below:
> > >
> > > 1) 1st round starts --> free_page_start
> > > 2) 1st round in progress..
> > > 3) 1st round ends --> free_page_stop
> > > 4) 2nd round starts --> free_page_start
> > > 5) 2nd round in progress..
> > > 6) 2nd round ends --> free_page_stop
> > > ......
> > >
> > > For example, suppose the VM is stopped during 2). Then
> > > virtio_balloon_poll_free_page_hints finds the vq is empty (i.e. elem
> > > == NULL) and the runstate is stopped, so the optimization thread
> > > exits immediately. That is, this optimization thread is gone forever
> > > (the optimization we can do for this round is done). We won't know
> > > when the VM will be woken up:
> > > A) If the VM is woken up very soon, while the migration thread is
> > > still in the middle of 2), then in 4) a new optimization thread (not
> > > the same one as in the first round) will be created and will start
> > > the optimization for the 2nd round as usual. (If you have questions
> > > about 3) in this case: free_page_stop will do nothing but return,
> > > since the optimization thread has already exited.)
> > > B) If the VM is woken up after the whole migration has ended, there
> > > is still no point in resuming the optimization.
> > >
> > > I think this would be the simple design for the first release of
> > > this optimization. There are possibilities to improve case A) above
> > > by continuing the optimization for the 1st round while it is still
> > > in progress, but I think adding that complexity for this rare case
> > > wouldn't be worthwhile (at least for now). What would you think?
> > >
> > >
> > > Best,
> > > Wei
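
A minimal standalone sketch of the per-round flow described above. Only the
free_page_start/free_page_stop hooks and the round structure come from the
description; migration_round() and the function bodies are made-up glue for
illustration, not the actual QEMU migration or virtio-balloon code.

/*
 * Sketch of the per-round flow listed above.  Only free_page_start /
 * free_page_stop and the round structure come from the discussion;
 * migration_round() and the function bodies are made-up stand-ins,
 * not the actual QEMU migration or virtio-balloon code.
 */
#include <stdbool.h>
#include <stdio.h>

static void free_page_start(void)
{
    /* Begin hinting for this round, e.g. spawn the worker (or, per the
     * suggestion below, schedule a BH) that drains the free page vq. */
    printf("free page hinting: start\n");
}

static void free_page_stop(void)
{
    /* Stop hinting before the dirty bitmap sync.  If the worker already
     * exited (e.g. the VM was stopped), this simply returns. */
    printf("free page hinting: stop\n");
}

static bool migration_round(int round)
{
    printf("round %d: sending dirty pages...\n", round);
    return round >= 2;  /* pretend the dirty set converges after round 2 */
}

int main(void)
{
    bool done = false;

    for (int round = 1; !done; round++) {
        free_page_start();              /* 1), 4), ...: round starts       */
        done = migration_round(round);  /* 2), 5), ...: round in progress  */
        free_page_stop();               /* 3), 6), ...: before bitmap sync */
    }
    return 0;
}

The point of the structure is that hinting is strictly bracketed by each
round, so whatever happens to the worker in between is not visible outside
the round.
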
> > In my opinion this just makes the patch very messy.
> >
> > E.g. attempts to attach a debugger to the guest will call vmstop and
> > then behaviour changes. This is a recipe for heisenbugs which are then
> > extremely painful to debug.
> >
> > It is not really hard to make things symmetrical:
> > e.g. if you stop on vmstop then you should start on vmstart, etc.
> > And stopping the thread should not involve a bunch of state changes;
> > just stop it and that's it.
> >
>
> "stop it" - do you mean to
> 1) make the thread exit (i.e. make virtio_balloon_poll_free_page_hints
> exit the while loop and return NULL); or
> 2) keep the thread in the while loop but have it yield (e.g. sleep(1)
> or block on a mutex)? (Or please let me know if you had a different
> implementation of stopping the thread in mind.)

I would say it makes more sense to make it block on something.

BTW I still think you are engaging in premature optimization here.
What you are doing here is a "data plane for balloon".
I would make the feature work first by processing this in a BH.
Creating threads immediately opens up questions of isolation, cgroups etc.

> If we go with 1), then we would not be able to resume the thread.
> If we go with 2), then there is no guarantee that the thread will
> continue to run immediately (there will be an unpredictable scheduling
> delay) when the VM is woken up. If the thread cannot run immediately
> when the VM is woken up, it is effectively the same as 1).
>
> In terms of heisenbugs, I think we can recommend that developers (of
> this feature) use methods that don't stop the VM (e.g. printing).
>
> Best,
> Wei

Sorry, this does not make sense to me. The reality is that for most
people migration works just fine. These patches implement a niche
optimization. The #1 priority should be to break no existing flows.

-- 
MST
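
To make the two options debated above concrete, here is a minimal standalone
sketch of option 2), the "block on something" approach: on vmstop the hinting
worker parks on a condition variable instead of exiting, and vmstart wakes it
so it resumes where it left off. Plain pthreads stand in for QEMU's
QemuMutex/QemuCond/QemuThread, and all names (hint_worker, vm_state_notify,
hinting_active) are made up for illustration; this is not the patch code.

/*
 * Sketch of option 2): the hinting worker blocks on a condition variable
 * while the VM is stopped and resumes on vmstart, instead of exiting.
 * Plain pthreads stand in for QEMU's QemuMutex/QemuCond/QemuThread;
 * every name here is made up for illustration (build with -lpthread).
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t run_cond = PTHREAD_COND_INITIALIZER;
static bool vm_running = true;      /* mirrors the runstate         */
static bool hinting_active = true;  /* cleared when the round ends  */

static void *hint_worker(void *opaque)
{
    for (;;) {
        pthread_mutex_lock(&lock);
        /* Park here across vmstop instead of exiting the thread. */
        while (!vm_running && hinting_active) {
            pthread_cond_wait(&run_cond, &lock);
        }
        bool active = hinting_active;
        pthread_mutex_unlock(&lock);

        if (!active) {
            break;                  /* this round of hinting is over */
        }
        /* ... pop one element from the free page vq and record it ... */
        usleep(1000);
    }
    return NULL;
}

/* Would be hooked into the VM state change notifier. */
static void vm_state_notify(bool running)
{
    pthread_mutex_lock(&lock);
    vm_running = running;
    if (running) {
        pthread_cond_signal(&run_cond);  /* resume where we left off */
    }
    pthread_mutex_unlock(&lock);
}

/* Stand-in for free_page_stop(): end the round and let the worker exit. */
static void hinting_stop(pthread_t tid)
{
    pthread_mutex_lock(&lock);
    hinting_active = false;
    pthread_cond_signal(&run_cond);
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, hint_worker, NULL);

    vm_state_notify(false);  /* vmstop: worker parks                  */
    sleep(1);
    vm_state_notify(true);   /* vmstart: the same worker resumes      */
    sleep(1);

    hinting_stop(tid);       /* end of round: worker exits cleanly    */
    printf("worker stopped cleanly\n");
    return 0;
}

Whether the wakeup after vmstart is prompt enough is exactly the scheduling
delay concern raised above; the sketch only shows that the thread's lifetime
no longer changes across vmstop/vmstart.
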
Tsirkin" Message-ID: <20180320045134-mutt-send-email-mst@kernel.org> References: <1521197309-13544-1-git-send-email-wei.w.wang@intel.com> <1521197309-13544-5-git-send-email-wei.w.wang@intel.com> <20180316165319-mutt-send-email-mst@kernel.org> <5AAE4124.5010602@intel.com> <20180319062023-mutt-send-email-mst@kernel.org> <5AAF7C72.5070403@intel.com> <20180320004900-mutt-send-email-mst@kernel.org> <5AB06EE9.1030306@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5AB06EE9.1030306@intel.com> Subject: Re: [virtio-dev] Re: [PATCH v5 4/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT To: Wei Wang Cc: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, quintela@redhat.com, dgilbert@redhat.com, pbonzini@redhat.com, liliang.opensource@gmail.com, yang.zhang.wz@gmail.com, quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com List-ID: On Tue, Mar 20, 2018 at 10:16:09AM +0800, Wei Wang wrote: > On 03/20/2018 06:55 AM, Michael S. Tsirkin wrote: > > On Mon, Mar 19, 2018 at 05:01:38PM +0800, Wei Wang wrote: > > > On 03/19/2018 12:24 PM, Michael S. Tsirkin wrote: > > > > On Sun, Mar 18, 2018 at 06:36:20PM +0800, Wei Wang wrote: > > > > > On 03/16/2018 11:16 PM, Michael S. Tsirkin wrote: > > > > > > On Fri, Mar 16, 2018 at 06:48:28PM +0800, Wei Wang wrote: > > > > OTOH it seems that if thread stops nothing will wake it up > > > > whem vm is restarted. Such bahaviour change across vmstop/vmstart > > > > is unexpected. > > > > I do not understand why we want to increment the counter > > > > on vm stop though. It does make sense to stop the thread > > > > but why not resume where we left off when vm is resumed? > > > > > > > I'm not sure which counter we incremented. But it would be clear if we have > > > a high level view of how it works (it is symmetric actually). Basically, we > > > start the optimization when each round starts and stop it at the end of each > > > round (i.e. before we do the bitmap sync), as shown below: > > > > > > 1) 1st Round starts --> free_page_start > > > 2) 1st Round in progress.. > > > 3) 1st Round ends --> free_page_stop > > > 4) 2nd Round starts --> free_page_start > > > 5) 2nd Round in progress.. > > > 6) 2nd Round ends --> free_page_stop > > > ...... > > > > > > For example, in 2), the VM is stopped. virtio_balloon_poll_free_page_hints > > > finds the vq is empty (i.e. elem == NULL) and the runstate is stopped, the > > > optimization thread exits immediately. That is, this optimization thread is > > > gone forever (the optimization we can do for this round is done). We won't > > > know when would the VM be woken up: > > > A) If the VM is woken up very soon when the migration thread is still in > > > progress of 2), then in 4) a new optimization thread (not the same one for > > > the first round) will be created and start the optimization for the 2nd > > > round as usual (If you have questions about 3) in this case, that > > > free_page_stop will do nothing than just return, since the optimization > > > thread has exited) ; > > > B) If the VM is woken up after the whole migration has ended, there is still > > > no point in resuming the optimization. > > > > > > I think this would be the simple design for the first release of this > > > optimization. There are possibilities to improve case A) above by continuing > > > optimization for the 1st Round as it is still in progress, but I think > > > adding that complexity for this rare case wouldn't be worthwhile (at least > > > for now). What would you think? 