From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:49735)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UK5YF-00038c-Ok
    for qemu-devel@nongnu.org; Mon, 25 Mar 2013 07:26:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1UK5YC-0006xC-6X
    for qemu-devel@nongnu.org; Mon, 25 Mar 2013 07:26:55 -0400
Received: from mx.ipv6.kamp.de ([2a02:248:0:51::16]:55712 helo=mx01.kamp.de)
    by eggs.gnu.org with smtp (Exim 4.71) (envelope-from ) id 1UK5YB-0006wg-TR
    for qemu-devel@nongnu.org; Mon, 25 Mar 2013 07:26:52 -0400
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
From: Peter Lieven
In-Reply-To: <1328150503.13028038.1364208799172.JavaMail.root@redhat.com>
Date: Mon, 25 Mar 2013 12:26:47 +0100
Content-Transfer-Encoding: quoted-printable
References: <1328150503.13028038.1364208799172.JavaMail.root@redhat.com>
Subject: Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
To: Paolo Bonzini
Cc: Orit Wasserman, quintela@redhat.com, qemu-devel@nongnu.org, Stefan Hajnoczi

On 25.03.2013 at 11:53, Paolo Bonzini wrote:

>> ubuntu 12.04 LTS 64-bit desktop with 1G memory shortly after boot:
>> histogram: 31.7% 32.9% [...] 36.4% 100.0%
>>
>> ---
>>
>> opensuse 11.1 64-bit with 24GB ram (busy server)
>> histogram: 97.5% 97.9% [...] 99.5% 100.0%
>>
>> ---
>>
>> windows server 2008 R2 with 8G ram running for 3 days:
>> histogram: 20.9% 21.3% [...] 22.5% 100.0%
>>
>> ---
>>
>> windows XP guest with 1G Ram running for approx. 1 hours
>> histogram: 25.6% [...] 35.8% 100.0%
>
> Doesn't this suggest checking the first _and the last_ word,
> and using the vectorized loop if none is zero?

Maybe I should have explained the output in more detail. The percentages are cumulative.
35.8% in the second-to-last column means that 35.8% of the pages have a
return value less than TARGET_PAGE_SIZE. This was meant to illustrate how
many 64-bit chunks you have to look at to catch a certain percentage of the
non-zero pages.

25.6% 26.1% 34.0% 34.1% 34.2% 34.3% 34.3% 34.4% 34.4% 34.4% 34.5% 34.5% 34.5% [...] 35.8% 100%

For example, the third value means that looking at the first three 64-bit
chunks catches 34.0% of all pages.

It turns out that the non-zeroness of a page can be detected by looking at
the first 256 or so bits; only a small percentage of pages turn out to be
non-zero at a later position. So after checking the first chunks one by one,
there is no big penalty in scanning the remaining chunks with the vectorized
loop.

Here is the distribution of return values for the Windows XP example:

25.62% 0.49% 7.86% 0.12% 0.15% 0.05% 0.05% 0.04% 0.05% 0.02% 0.03% 0.02%
0.03% 0.02% 0.02% 0.01% 0.03% 0.02% 0.01% 0.02% 0.02% 0.01% 0.02% 0.01%
0.01% 0.01% 0.01% 0.01% 0.02% 0.00% 0.01% 0.02% 0.03% 0.01% 0.01% 0.01%
0.01% 0.01% 0.01% 0.00% 0.00% 0.01% 0.07% 0.00% 0.00% 0.00% 0.00% 0.00%
0.01% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00%
0.01% 0.02% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.01% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.02% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.03% 0.00% 0.00% 0.00%
0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00%
0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
0.02% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
0.01% 0.00% 0.00% 0.00% 0.02% 0.00% 0.00% 0.02% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.01% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.01% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.01% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.01% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 64.23%

The
last value is the percentage of pages with a return value of
TARGET_PAGE_SIZE, meaning the page is all zero.

Peter
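
P.S. To make the idea concrete, here is a minimal sketch in plain C of the
two-phase check described above: look at the first few 64-bit chunks one by
one, then scan the rest in blocks. All names (find_nonzero_offset_sketch,
PAGE_SIZE_BYTES, HEAD_CHUNKS, BLOCK_CHUNKS) are hypothetical and not taken
from the patch series; the page size of 4096 bytes is an assumption, and the
block loop is only a scalar stand-in for a real vectorized loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration; stand-in for TARGET_PAGE_SIZE. */
#define PAGE_SIZE_BYTES 4096u
#define CHUNKS_PER_PAGE (PAGE_SIZE_BYTES / sizeof(uint64_t))
#define HEAD_CHUNKS     4u   /* 4 * 64 bits = 256 bits checked one by one */
#define BLOCK_CHUNKS    4u   /* chunks OR-ed together per bulk iteration */

_Static_assert((CHUNKS_PER_PAGE - HEAD_CHUNKS) % BLOCK_CHUNKS == 0,
               "bulk loop must cover the rest of the page exactly");

/* Load the i-th 64-bit chunk without alignment or aliasing assumptions. */
static uint64_t load_chunk(const unsigned char *page, size_t i)
{
    uint64_t v;
    memcpy(&v, page + i * sizeof(v), sizeof(v));
    return v;
}

/*
 * Return the byte offset of the first non-zero 64-bit chunk, or
 * PAGE_SIZE_BYTES if the whole page is zero.
 */
static size_t find_nonzero_offset_sketch(const unsigned char *page)
{
    size_t i;

    assert(page != NULL);

    /* Phase 1: check the first chunks individually. */
    for (i = 0; i < HEAD_CHUNKS; i++) {
        if (load_chunk(page, i) != 0) {
            return i * sizeof(uint64_t);
        }
    }

    /* Phase 2: bulk scan, OR-ing a block of chunks per iteration.
     * A real implementation would use vector instructions here. */
    for (; i < CHUNKS_PER_PAGE; i += BLOCK_CHUNKS) {
        uint64_t acc = 0;
        for (size_t j = 0; j < BLOCK_CHUNKS; j++) {
            acc |= load_chunk(page, i + j);
        }
        if (acc != 0) {
            break;
        }
    }

    /* Pin down the exact chunk inside the block that tripped the scan. */
    for (; i < CHUNKS_PER_PAGE; i++) {
        if (load_chunk(page, i) != 0) {
            return i * sizeof(uint64_t);
        }
    }
    return PAGE_SIZE_BYTES;
}
```

With the Windows XP numbers above, phase 1 alone would return for roughly
34.1% of all pages, and almost everything else runs the bulk loop to the end
because the page is all zero.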