From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Li, Liang Z"
To: "Michael S. Tsirkin" , "Hansen, Dave"
CC: Andrea Arcangeli , David Hildenbrand , "kvm@vger.kernel.org" , "mhocko@suse.com" , "linux-kernel@vger.kernel.org" , "qemu-devel@nongnu.org" , "linux-mm@kvack.org" , "dgilbert@redhat.com" , "pbonzini@redhat.com" , "akpm@linux-foundation.org" , "virtualization@lists.linux-foundation.org" , "kirill.shutemov@linux.intel.com"
Subject: RE: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
Date: Fri, 16 Dec 2016 01:12:21 +0000
References: <0b83db29-ebad-2a70-8d61-756d33e33a48@intel.com> <2171e091-46ee-decd-7348-772555d3a5e3@redhat.com> <20161207183817.GE28786@redhat.com> <20161207202824.GH28786@redhat.com> <060287c7-d1af-45d5-70ea-ad35d4bbeb84@intel.com> <01886693-c73e-3696-860b-086417d695e1@intel.com> <20161215173901-mutt-send-email-mst@kernel.org>
In-Reply-To: <20161215173901-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> On Thu, Dec 15, 2016 at 07:34:33AM -0800, Dave Hansen wrote:
> > On 12/14/2016 12:59 AM, Li, Liang Z wrote:
> > >> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend
> > >> virtio-balloon for fast (de)inflating & fast live migration
> > >>
> > >> On 12/08/2016 08:45 PM, Li, Liang Z wrote:
> > >>> What's the conclusion of your discussion? It seems you want some
> > >>> statistic before deciding whether to ripping the bitmap from the
> > >>> ABI, am I right?
> > >>
> > >> I think Andrea and David feel pretty strongly that we should remove
> > >> the bitmap, unless we have some data to support keeping it. I
> > >> don't feel as strongly about it, but I think their critique of it
> > >> is pretty valid. I think the consensus is that the bitmap needs to go.
> > >>
> > >> The only real question IMNHO is whether we should do a power-of-2
> > >> or a length. But, if we have 12 bits, then the argument for doing
> > >> length is pretty strong. We don't need anywhere near 12 bits if doing
> > >> power-of-2.
> > >
> > > Just found the MAX_ORDER should be limited to 12 if use length
> > > instead of order, If the MAX_ORDER is configured to a value bigger
> > > than 12, it will make things more complex to handle this case.
> > >
> > > If use order, we need to break a large memory range whose length is
> > > not the power of 2 into several small ranges, it also make the code
> > > complex.
> >
> > I can't imagine it makes the code that much more complex. It adds a
> > for loop. Right?
> >
> > > It seems we leave too many bit for the pfn, and the bits leave for
> > > length is not enough, How about keep 45 bits for the pfn and 19 bits
> > > for length, 45 bits for pfn can cover 57 bits physical address, that
> > > should be enough in the near future.
> > >
> > > What's your opinion?
> >
> > I still think 'order' makes a lot of sense. But, as you say, 57 bits
> > is enough for x86 for a while. Other architectures.... who knows?
>
> I think you can probably assume page size >= 4K. But I would not want to
> make any other assumptions. E.g. there are systems that absolutely require
> you to set high bits for DMA.
>
> I think we really want both length and order.
>
> I understand how you are trying to pack them as tightly as possible.
>
> However, I thought of a trick, we don't need to encode all possible orders.
> For example, with 2 bits of order, we can make them mean:
> 00 - 4K pages
> 01 - 2M pages
> 02 - 1G pages
>
> guest can program the sizes for each order through config space.
>
> We will have 10 bits left for length.
>

Please don't; we just got rid of the bitmap for the sake of simplicity. :)

> It might make sense to also allow guest to program the number of bits used
> for order, this will make it easy to extend without host changes.
>

There is still the case where MAX_ORDER is configured to a large value,
e.g. 36 for a system with a huge amount of memory. Then there are only
28 bits left for the pfn, which is not enough. Should we limit MAX_ORDER?
I don't think so. It seems that using order is better.

Thanks!
Liang

> --
> MST
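
For anyone reading the thread later, here is a rough C sketch of the two encodings discussed above. It is only an illustration under stated assumptions, not code from the patch set: the 45/19 bit split is the one proposed in the quoted text (45 pfn bits plus a 12-bit offset for 4K pages gives 57 address bits), while every name below (PFN_BITS, pack_range, split_by_order, struct chunk) is hypothetical. The split helper also assumes each power-of-2 chunk must be naturally aligned, as buddy-allocator blocks are; the thread does not state that requirement.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: low 45 bits hold the pfn, high 19 bits the length in pages. */
#define PFN_BITS 45
#define LEN_BITS 19
#define PFN_MASK ((UINT64_C(1) << PFN_BITS) - 1)
#define LEN_MAX  ((UINT64_C(1) << LEN_BITS) - 1)
/* With 4K pages (12 offset bits), a 45-bit pfn reaches 57 physical address bits. */
#define PHYS_BITS (PFN_BITS + 12)

/* "Length" encoding: pack a pfn and a page count into one 64-bit word.
 * Both values are masked, so the caller must keep len <= LEN_MAX. */
static inline uint64_t pack_range(uint64_t pfn, uint64_t len)
{
    return ((len & LEN_MAX) << PFN_BITS) | (pfn & PFN_MASK);
}

static inline uint64_t range_pfn(uint64_t v) { return v & PFN_MASK; }
static inline uint64_t range_len(uint64_t v) { return v >> PFN_BITS; }

/* "Order" encoding: one chunk covers 2^order pages starting at pfn. */
struct chunk {
    uint64_t pfn;
    unsigned order;
};

/*
 * Split [pfn, pfn + nr_pages) into naturally aligned power-of-2 chunks,
 * each of order <= max_order (max_order is assumed to be below 63).
 * Writes at most cap chunks to out and returns how many were written;
 * if cap is too small, the tail of the range is left uncovered.
 */
static size_t split_by_order(uint64_t pfn, uint64_t nr_pages,
                             unsigned max_order,
                             struct chunk *out, size_t cap)
{
    size_t n = 0;

    while (nr_pages > 0 && n < cap) {
        unsigned order = 0;

        /* Grow the chunk while the next size stays aligned and still fits. */
        while (order < max_order &&
               (pfn & ((UINT64_C(2) << order) - 1)) == 0 &&
               (UINT64_C(2) << order) <= nr_pages)
            order++;

        out[n].pfn = pfn;
        out[n].order = order;
        n++;

        pfn += UINT64_C(1) << order;
        nr_pages -= UINT64_C(1) << order;
    }
    return n;
}
```

With this sketch, a 5-page range starting at pfn 0 becomes two chunks (order 2 at pfn 0, order 0 at pfn 4), which is the "for loop" cost of the order scheme. The length scheme describes the same range in a single word but caps the range at LEN_MAX pages.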