From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933013AbcCIP2Z (ORCPT ); Wed, 9 Mar 2016 10:28:25 -0500
Received: from mga03.intel.com ([134.134.136.65]:49924 "EHLO mga03.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753662AbcCIP2S convert rfc822-to-8bit (ORCPT );
	Wed, 9 Mar 2016 10:28:18 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.24,311,1455004800"; d="scan'208";a="62878645"
From: "Li, Liang Z"
To: Roman Kagan , "Michael S. Tsirkin"
CC: "Dr. David Alan Gilbert" , "ehabkost@redhat.com" ,
	"kvm@vger.kernel.org" , "quintela@redhat.com" ,
	"linux-kernel@vger.kernel.org" , "qemu-devel@nongnu.org" ,
	"linux-mm@kvack.org" , "amit.shah@redhat.com" ,
	"pbonzini@redhat.com" , "akpm@linux-foundation.org" ,
	"virtualization@lists.linux-foundation.org" , "rth@twiddle.net" ,
	"riel@redhat.com"
Subject: RE: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
Thread-Topic: [Qemu-devel] [RFC qemu 0/4] A PV solution for live migration optimization
Thread-Index: AQHRdTqPjTxTnYWKZEWM4lf/HjT6rZ9HeJiAgAEJK0D//+lWAIAAiTOQ//+bAoCAAMUSUP//hAyAABHVw1AAK0jMgABYA90w///aHwCAA1PTAP//b1yQ
Date: Wed, 9 Mar 2016 15:27:54 +0000
Message-ID:
References: <20160304081411.GD9100@rkaganb.sw.ru>
	<20160304102346.GB2479@rkaganb.sw.ru>
	<20160304163246-mutt-send-email-mst@redhat.com>
	<20160305214748-mutt-send-email-mst@redhat.com>
	<20160307110852-mutt-send-email-mst@redhat.com>
	<20160309142851.GA9715@rkaganb.sw.ru>
In-Reply-To: <20160309142851.GA9715@rkaganb.sw.ru>
Accept-Language: zh-CN, en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ctpclassification: CTP_IC
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> On Mon, Mar 07, 2016 at 01:40:06PM +0200, Michael S. Tsirkin wrote:
> > On Mon, Mar 07, 2016 at 06:49:19AM +0000, Li, Liang Z wrote:
> > > > > No. And it's exactly what I mean. The ballooned memory is still
> > > > > processed during live migration without skipping. The live
> > > > > migration code is in migration/ram.c.
> > > >
> > > > So if the guest acknowledged VIRTIO_BALLOON_F_MUST_TELL_HOST, we can
> > > > teach qemu to skip these pages.
> > > > Want to write a patch to do this?
> > >
> > > Yes, we really can teach qemu to skip these pages and it's not hard.
> > > The problem is the poor performance; this PV solution
> >
> > Balloon is always PV. And do not call patches solutions please.
> >
> > > is aimed at making it more efficient and reducing the performance
> > > impact on the guest.
> >
> > We need to get a bit beyond this. You are making multiple changes; it
> > seems to make sense to split it all up and analyse each change
> > separately.
>
> Couldn't agree more.
>
> There are three stages in this optimization:
>
> 1) choosing which pages to skip
>
> 2) communicating them from guest to host
>
> 3) skipping the transfer of uninteresting pages to the remote side on
>    migration
>
> For (3) there seems to be low-hanging fruit: amend
> migration/ram.c:is_zero_range() to consult /proc/self/pagemap.
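For reference, the pagemap lookup suggested here boils down to reading one 64-bit entry per virtual page and testing bit 63 (the "page present" flag); the entry layout is described in the kernel's pagemap documentation. A minimal user-space sketch (Python only for brevity, and the helper name is made up for illustration):

```python
import ctypes
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PM_ENTRY_BYTES = 8        # one 64-bit pagemap entry per virtual page
PM_PRESENT = 1 << 63      # bit 63: page is present in RAM

def page_present(vaddr):
    """Return True if the page backing vaddr currently has a physical frame."""
    with open("/proc/self/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * PM_ENTRY_BYTES)
        entry, = struct.unpack("<Q", f.read(PM_ENTRY_BYTES))
    return bool(entry & PM_PRESENT)

# A fresh anonymous mapping has no physical page behind it until touched.
buf = mmap.mmap(-1, PAGE_SIZE)
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
before = page_present(addr)
buf[0] = 1                # writing to the page faults it in
after = page_present(addr)
print(before, after)      # after is True once the page has been faulted in
```

An anonymous page that has never been touched has no frame behind it, which matches the "hasn't been touched yet or has been ballooned out" case being discussed.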
> This would work for guest RAM that hasn't been touched yet or which has
> been ballooned out.
>
> For (1) I've been trying to make the point that skipping clean pages is
> much more likely to result in a noticeable benefit than skipping free
> pages only.

I am considering dropping the page cache before getting the free pages.

> As for (2), we do seem to have a problem with the existing balloon:
> according to your measurements it's very slow; besides, I guess it plays

I didn't say communicating is slow. Even if it were, my solution uses a bitmap instead of PFNs, so there is less data traffic and it's faster than the existing balloon, which uses PFNs.

> badly with transparent huge pages (as both the guest and the host work
> with one 4k page at a time). This is a problem for other use cases of
> the balloon (e.g. as a facility for resource management); tackling that
> appears a more natural application for optimization efforts.
>
> Thanks,
> Roman.
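To put rough numbers on the bitmap-versus-PFN point, here is a back-of-the-envelope sketch; the guest size and free-page ratio are made-up illustrative values, not measurements from this thread:

```python
# Illustrative numbers only: an 8 GiB guest with 4 KiB pages,
# half of which are free at migration time.
PAGE_SIZE = 4096
guest_bytes = 8 << 30
total_pages = guest_bytes // PAGE_SIZE   # 2,097,152 pages
free_pages = total_pages // 2

pfn_list_bytes = free_pages * 8          # one 64-bit PFN per reported page
bitmap_bytes = total_pages // 8          # one bit per page, free or not

print(pfn_list_bytes >> 20)  # 8   (MiB of PFN list)
print(bitmap_bytes >> 10)    # 256 (KiB of bitmap)
```

The bitmap's cost is fixed at one bit per page of guest RAM, while a PFN list grows with the number of pages reported, so the bitmap wins whenever more than 1/64 of the pages are free.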