From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
Date: Thu, 15 Dec 2016 17:09:10 -0800
References: <1480495397-23225-1-git-send-email-liang.z.li@intel.com>
 <0b18c636-ee67-cbb4-1ba3-81a06150db76@redhat.com>
 <0b83db29-ebad-2a70-8d61-756d33e33a48@intel.com>
 <2171e091-46ee-decd-7348-772555d3a5e3@redhat.com>
 <20161207183817.GE28786@redhat.com>
 <20161207202824.GH28786@redhat.com>
 <060287c7-d1af-45d5-70ea-ad35d4bbeb84@intel.com>
 <01886693-c73e-3696-860b-086417d695e1@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: virtualization-bounces@lists.linux-foundation.org
Errors-To: virtualization-bounces@lists.linux-foundation.org
To: "Li, Liang Z", Andrea Arcangeli
Cc: "mhocko@suse.com", "kvm@vger.kernel.org", "mst@redhat.com",
 "qemu-devel@nongnu.org", "linux-kernel@vger.kernel.org",
 "linux-mm@kvack.org", "kirill.shutemov@linux.intel.com",
 "pbonzini@redhat.com", "akpm@linux-foundation.org",
 "virtualization@lists.linux-foundation.org", "dgilbert@redhat.com"
List-Id: virtualization@lists.linuxfoundation.org

On 12/15/2016 04:48 PM, Li, Liang Z wrote:
>>> It seems we leave too many bit for the pfn, and the bits leave for
>>> length is not enough, How about keep 45 bits for the pfn and 19 bits
>>> for length, 45 bits for pfn can cover 57 bits physical address, that
>>> should be enough in the near feature.
>>> What's your opinion?
>> I still think 'order' makes a lot of sense. But, as you say, 57 bits
>> is enough for x86 for a while. Other architectures.... who knows?

Thinking about this some more... There are really only two cases that
matter: 4k pages and "much bigger" ones.
Squeezing each 4k page into 8 bytes of metadata helps guarantee that
this scheme won't regress over the old scheme in any case. For bigger
ranges, 8 vs 16 bytes means *nothing*. And 16 bytes will be as good or
better than the old scheme for everything which is >4k.

How about this:
 * 52 bits of 'pfn', 5 bits of 'order', 7 bits of 'length'
 * One special 'length' value to mean "actual length in next 8 bytes"

That should be pretty simple to produce and decode. We have two record
sizes, but I think it is manageable.
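
To make the proposal concrete, here is a rough sketch of what producing
and decoding such records could look like. Several details are
assumptions that the proposal does not pin down: the field order within
the 64-bit word (pfn in the high bits, then order, then length), the
choice of the all-ones 7-bit value as the "length is in the next 8
bytes" marker, and every name used (struct range, rec_encode,
rec_decode, the REC_* macros). With 4k pages, 52 bits of pfn cover a
full 64-bit physical address (52 + 12 = 64).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, high to low: pfn[63:12] | order[11:7] | length[6:0]. */
#define REC_PFN_BITS   52
#define REC_ORDER_BITS 5
#define REC_LEN_BITS   7
#define REC_ORDER_MASK ((1ULL << REC_ORDER_BITS) - 1)  /* 0x1f */
#define REC_LEN_MASK   ((1ULL << REC_LEN_BITS) - 1)    /* 0x7f */
/* Assumed escape value: the real length lives in the following word. */
#define REC_LEN_EXT    REC_LEN_MASK

/* Hypothetical in-memory form of one range before/after packing. */
struct range {
	uint64_t pfn;
	unsigned order;
	uint64_t length;
};

/* Pack one range into out[]; returns the number of 8-byte words used. */
static size_t rec_encode(const struct range *r, uint64_t out[2])
{
	uint64_t head;

	assert(r->pfn < (1ULL << REC_PFN_BITS));
	assert(r->order <= REC_ORDER_MASK);

	head = (r->pfn << (REC_ORDER_BITS + REC_LEN_BITS)) |
	       ((uint64_t)r->order << REC_LEN_BITS);

	if (r->length < REC_LEN_EXT) {
		/* Common case: everything fits in a single 8-byte record. */
		out[0] = head | r->length;
		return 1;
	}

	/* Long range: mark it and spill the length into a second word. */
	out[0] = head | REC_LEN_EXT;
	out[1] = r->length;
	return 2;
}

/* Unpack one record from in[]; returns the number of words consumed. */
static size_t rec_decode(const uint64_t *in, struct range *r)
{
	r->pfn = in[0] >> (REC_ORDER_BITS + REC_LEN_BITS);
	r->order = (unsigned)((in[0] >> REC_LEN_BITS) & REC_ORDER_MASK);
	r->length = in[0] & REC_LEN_MASK;

	if (r->length != REC_LEN_EXT)
		return 1;

	r->length = in[1];
	return 2;
}
```

In this sketch a single 4k page (order 0, length 1) always takes one
8-byte record, and only ranges whose length does not fit below the
marker value pay for the second word. A consumer walks the buffer by
advancing by whatever rec_decode() returns.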