From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756862AbcLPBMz (ORCPT ); Thu, 15 Dec 2016 20:12:55 -0500 Received: from mga14.intel.com ([192.55.52.115]:45265 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753017AbcLPBMf (ORCPT ); Thu, 15 Dec 2016 20:12:35 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,355,1477983600"; d="scan'208";a="42906608" Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration To: "Li, Liang Z" , Andrea Arcangeli References: <1480495397-23225-1-git-send-email-liang.z.li@intel.com> <0b18c636-ee67-cbb4-1ba3-81a06150db76@redhat.com> <0b83db29-ebad-2a70-8d61-756d33e33a48@intel.com> <2171e091-46ee-decd-7348-772555d3a5e3@redhat.com> <20161207183817.GE28786@redhat.com> <20161207202824.GH28786@redhat.com> <060287c7-d1af-45d5-70ea-ad35d4bbeb84@intel.com> <01886693-c73e-3696-860b-086417d695e1@intel.com> Cc: David Hildenbrand , "kvm@vger.kernel.org" , "mhocko@suse.com" , "mst@redhat.com" , "linux-kernel@vger.kernel.org" , "qemu-devel@nongnu.org" , "linux-mm@kvack.org" , "dgilbert@redhat.com" , "pbonzini@redhat.com" , "akpm@linux-foundation.org" , "virtualization@lists.linux-foundation.org" , "kirill.shutemov@linux.intel.com" From: Dave Hansen Message-ID: Date: Thu, 15 Dec 2016 17:09:10 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/15/2016 04:48 PM, Li, Liang Z wrote: >>> It seems we leave too many bit for the pfn, and the bits leave for >>> length is not enough, How about keep 45 bits for the pfn and 19 bits >>> for length, 45 bits for pfn can cover 57 bits physical address, that should be >> enough in the near feature. >>> What's your opinion? >> I still think 'order' makes a lot of sense. But, as you say, 57 bits is enough for >> x86 for a while. Other architectures.... who knows? Thinking about this some more... There are really only two cases that matter: 4k pages and "much bigger" ones. Squeezing each 4k page into 8 bytes of metadata helps guarantee that this scheme won't regress over the old scheme in any cases. For bigger ranges, 8 vs 16 bytes means *nothing*. And 16 bytes will be as good or better than the old scheme for everything which is >4k. How about this: * 52 bits of 'pfn', 5 bits of 'order', 7 bits of 'length' * One special 'length' value to mean "actual length in next 8 bytes" That should be pretty simple to produce and decode. We have two record sizes, but I think it is manageable.