From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Wed, 24 Jul 2013 13:19:33 +0300 Message-ID: <20130724101932.GA14229@redhat.com> References: <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> <51EF3B67.5060403@windriver.com> <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> <20130724100104.GE16400@redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: =?cp1255?B?IpN0aWVqdW4uY2hlbpQi?= , Bhushan Bharat-R65777 , kvm-ppc@vger.kernel.org, "kvm@vger.kernel.org list" , Wood Scott-B07421 , Paolo Bonzini , Andrea Arcangeli To: Alexander Graf Return-path: Content-Disposition: inline In-Reply-To: Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote: > > On 24.07.2013, at 12:01, Gleb Natapov wrote: > > > Copying Andrea for him to verify that I am not talking nonsense :) > > > > On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote: > >>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > >>> index 1580dd4..5e8635b 100644 > >>> --- a/virt/kvm/kvm_main.c > >>> +++ b/virt/kvm/kvm_main.c > >>> @@ -102,6 +102,10 @@ static bool largepages_enabled = true; > >>> > >>> bool kvm_is_mmio_pfn(pfn_t pfn) > >>> { > >>> +#ifdef CONFIG_MEMORY_HOTPLUG > >> > >> I'd feel safer if we narrow this down to e500. > >> > >>> + /* > >>> + * Currently only in memory hot remove case we may still need this. > >>> + */ > >>> if (pfn_valid(pfn)) { > >> > >> We still have to check for pfn_valid, no? So the #ifdef should be down here. > >> > >>> int reserved; > >>> struct page *tail = pfn_to_page(pfn); > >>> @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn) > >>> } > >>> return PageReserved(tail); > >>> } > >>> +#endif > >>> > >>> return true; > >>> } > >>> > >>> Before apply this change: > >>> > >>> real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 1m21.376s > >>> user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 0m23.433s > >>> sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s > >>> > >>> After apply this change: > >>> > >>> real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 1m20.667s > >>> user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 0m22.615s > >>> sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s > >>> > >>> So, > >>> > >>> real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6% > >>> user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5% > >>> sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7% > >> > >> Very nice, so there is a real world performance benefit to doing this. Then yes, I think it would make sense to change the global helper function to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead. > >> > >> Gleb, Paolo, any hard feelings? > >> > > I do not see how can we break the function in such a way and get > > away with it. Not all valid pfns point to memory. Physical address can > > be sparse (due to PCI hole, framebuffer or just because). > > But we don't check for sparseness today in here either. We merely check for incomplete huge pages. > That's not how I read the code. The code checks for reserved flag set. It should be set on pfns that point to memory holes. As far as I understand huge page tricks they are there to guaranty that THP does not change flags under us, Andrea? -- Gleb. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Date: Wed, 24 Jul 2013 10:19:33 +0000 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Message-Id: <20130724101932.GA14229@redhat.com> List-Id: References: <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> <51EF3B67.5060403@windriver.com> <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> <20130724100104.GE16400@redhat.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Alexander Graf Cc: =?cp1255?B?IpN0aWVqdW4uY2hlbpQi?= , Bhushan Bharat-R65777 , kvm-ppc@vger.kernel.org, "kvm@vger.kernel.org list" , Wood Scott-B07421 , Paolo Bonzini , Andrea Arcangeli On Wed, Jul 24, 2013 at 12:09:42PM +0200, Alexander Graf wrote: > > On 24.07.2013, at 12:01, Gleb Natapov wrote: > > > Copying Andrea for him to verify that I am not talking nonsense :) > > > > On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote: > >>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > >>> index 1580dd4..5e8635b 100644 > >>> --- a/virt/kvm/kvm_main.c > >>> +++ b/virt/kvm/kvm_main.c > >>> @@ -102,6 +102,10 @@ static bool largepages_enabled = true; > >>> > >>> bool kvm_is_mmio_pfn(pfn_t pfn) > >>> { > >>> +#ifdef CONFIG_MEMORY_HOTPLUG > >> > >> I'd feel safer if we narrow this down to e500. > >> > >>> + /* > >>> + * Currently only in memory hot remove case we may still need this. > >>> + */ > >>> if (pfn_valid(pfn)) { > >> > >> We still have to check for pfn_valid, no? So the #ifdef should be down here. > >> > >>> int reserved; > >>> struct page *tail = pfn_to_page(pfn); > >>> @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn) > >>> } > >>> return PageReserved(tail); > >>> } > >>> +#endif > >>> > >>> return true; > >>> } > >>> > >>> Before apply this change: > >>> > >>> real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 1m21.376s > >>> user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 0m23.433s > >>> sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s > >>> > >>> After apply this change: > >>> > >>> real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 1m20.667s > >>> user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 0m22.615s > >>> sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s > >>> > >>> So, > >>> > >>> real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6% > >>> user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5% > >>> sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7% > >> > >> Very nice, so there is a real world performance benefit to doing this. Then yes, I think it would make sense to change the global helper function to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead. > >> > >> Gleb, Paolo, any hard feelings? > >> > > I do not see how can we break the function in such a way and get > > away with it. Not all valid pfns point to memory. Physical address can > > be sparse (due to PCI hole, framebuffer or just because). > > But we don't check for sparseness today in here either. We merely check for incomplete huge pages. > That's not how I read the code. The code checks for reserved flag set. It should be set on pfns that point to memory holes. As far as I understand huge page tricks they are there to guaranty that THP does not change flags under us, Andrea? -- Gleb.