From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Wed, 24 Jul 2013 13:01:04 +0300 Message-ID: <20130724100104.GE16400@redhat.com> References: <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> <51EF3B67.5060403@windriver.com> <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: =?cp1255?B?IpN0aWVqdW4uY2hlbpQi?= , Bhushan Bharat-R65777 , kvm-ppc@vger.kernel.org, "kvm@vger.kernel.org list" , Wood Scott-B07421 , Paolo Bonzini , Andrea Arcangeli To: Alexander Graf Return-path: Content-Disposition: inline In-Reply-To: <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org Copying Andrea for him to verify that I am not talking nonsense :) On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote: > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index 1580dd4..5e8635b 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -102,6 +102,10 @@ static bool largepages_enabled = true; > > > > bool kvm_is_mmio_pfn(pfn_t pfn) > > { > > +#ifdef CONFIG_MEMORY_HOTPLUG > > I'd feel safer if we narrow this down to e500. > > > + /* > > + * Currently only in memory hot remove case we may still need this. > > + */ > > if (pfn_valid(pfn)) { > > We still have to check for pfn_valid, no? So the #ifdef should be down here. > > > int reserved; > > struct page *tail = pfn_to_page(pfn); > > @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn) > > } > > return PageReserved(tail); > > } > > +#endif > > > > return true; > > } > > > > Before apply this change: > > > > real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 1m21.376s > > user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 0m23.433s > > sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s > > > > After apply this change: > > > > real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 1m20.667s > > user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 0m22.615s > > sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s > > > > So, > > > > real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6% > > user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5% > > sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7% > > Very nice, so there is a real world performance benefit to doing this. Then yes, I think it would make sense to change the global helper function to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead. > > Gleb, Paolo, any hard feelings? > I do not see how can we break the function in such a way and get away with it. Not all valid pfns point to memory. Physical address can be sparse (due to PCI hole, framebuffer or just because). -- Gleb. From mboxrd@z Thu Jan 1 00:00:00 1970 From: Gleb Natapov Date: Wed, 24 Jul 2013 10:01:04 +0000 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Message-Id: <20130724100104.GE16400@redhat.com> List-Id: References: <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> <51EF3B67.5060403@windriver.com> <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> In-Reply-To: <597D9B3F-BCE8-4483-B485-3035D6D443AC@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: Alexander Graf Cc: =?cp1255?B?IpN0aWVqdW4uY2hlbpQi?= , Bhushan Bharat-R65777 , kvm-ppc@vger.kernel.org, "kvm@vger.kernel.org list" , Wood Scott-B07421 , Paolo Bonzini , Andrea Arcangeli Copying Andrea for him to verify that I am not talking nonsense :) On Wed, Jul 24, 2013 at 10:25:20AM +0200, Alexander Graf wrote: > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > > index 1580dd4..5e8635b 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -102,6 +102,10 @@ static bool largepages_enabled = true; > > > > bool kvm_is_mmio_pfn(pfn_t pfn) > > { > > +#ifdef CONFIG_MEMORY_HOTPLUG > > I'd feel safer if we narrow this down to e500. > > > + /* > > + * Currently only in memory hot remove case we may still need this. > > + */ > > if (pfn_valid(pfn)) { > > We still have to check for pfn_valid, no? So the #ifdef should be down here. > > > int reserved; > > struct page *tail = pfn_to_page(pfn); > > @@ -124,6 +128,7 @@ bool kvm_is_mmio_pfn(pfn_t pfn) > > } > > return PageReserved(tail); > > } > > +#endif > > > > return true; > > } > > > > Before apply this change: > > > > real (1m19.954s + 1m20.918s + 1m22.740s + 1m21.146s + 1m22.120s)/5= 1m21.376s > > user (0m23.181s + 0m23.550s + 0m23.506s + 0m23.410s + 0m23.520s)/5= 0m23.433s > > sys (0m49.087s + 0m49.563s + 0m51.758s + 0m50.290s + 0m51.047s)/5= 0m50.349s > > > > After apply this change: > > > > real (1m19.507s + 1m20.919s + 1m21.436s + 1m21.179s + 1m20.293s)/5= 1m20.667s > > user (0m22.595s + 0m22.719s + 0m22.484s + 0m22.811s + 0m22.467s)/5= 0m22.615s > > sys (0m48.841s + 0m49.929s + 0m50.310s + 0m49.813s + 0m48.587s)/5= 0m49.496s > > > > So, > > > > real (1m20.667s - 1m21.376s)/1m21.376s x 100% = -0.6% > > user (0m22.615s - 0m23.433s)/0m23.433s x 100% = -3.5% > > sys (0m49.496s - 0m50.349s)/0m50.349s x 100% = -1.7% > > Very nice, so there is a real world performance benefit to doing this. Then yes, I think it would make sense to change the global helper function to be fast on e500 and use that one from e500_shadow_mas2_attrib() instead. > > Gleb, Paolo, any hard feelings? > I do not see how can we break the function in such a way and get away with it. Not all valid pfns point to memory. Physical address can be sparse (due to PCI hole, framebuffer or just because). -- Gleb.