From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Thu, 18 Jul 2013 12:27:49 +0200 Message-ID: References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 To: =?windows-1252?Q?=22=93tiejun=2Echen=94=22?= Return-path: In-Reply-To: <51E7C14C.6060600@windriver.com> Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 18.07.2013, at 12:19, =E2=80=9Ctiejun.chen=E2=80=9D wrote: > On 07/18/2013 06:12 PM, Alexander Graf wrote: >>=20 >> On 18.07.2013, at 12:08, =E2=80=9Ctiejun.chen=E2=80=9D wrote: >>=20 >>> On 07/18/2013 05:48 PM, Alexander Graf wrote: >>>>=20 >>>> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: >>>>=20 >>>>>=20 >>>>>=20 >>>>>> -----Original Message----- >>>>>> From: Bhushan Bharat-R65777 >>>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>>> To: '"=EF=BF=BDtiejun.chen=EF=BF=BD"' >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de;= Wood Scott- >>>>>> B07421 >>>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only = for kernel >>>>>> managed pages >>>>>>=20 >>>>>>=20 >>>>>>=20 >>>>>>> -----Original Message----- >>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@windr= iver.com] >>>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>>> To: Bhushan Bharat-R65777 >>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de= ; Wood >>>>>>> Scott- >>>>>>> B07421 >>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only= for >>>>>>> kernel managed pages >>>>>>>=20 >>>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>=20 >>>>>>>>=20 >>>>>>>>> -----Original Message----- >>>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "=EF=BF=BD= tiejun.chen=EF=BF=BD" >>>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.= de; >>>>>>>>> Wood >>>>>>>>> Scott- >>>>>>>>> B07421 >>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency on= ly for >>>>>>>>> kernel managed pages >>>>>>>>>=20 >>>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@w= indriver.com] >>>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@sus= e.de; >>>>>>>>>>> Wood >>>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency = only >>>>>>>>>>> for kernel managed pages >>>>>>>>>>>=20 >>>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>>> If there is a struct page for the requested mapping then i= t's >>>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheab= le) >>>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>>> inhibited, >>>>>>>>>>>> guarded) >>>>>>>>>>>>=20 >>>>>>>>>>>> This helps setting proper TLB mapping for direct assigned = device >>>>>>>>>>>>=20 >>>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>>> --- >>>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++---= -- >>>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>>>=20 >>>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>>> usermode) >>>>>>>>>>>> return mas3; >>>>>>>>>>>> } >>>>>>>>>>>>=20 >>>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>>> usermode) >>>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t= pfn) >>>>>>>>>>>> { >>>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>>> + >>>>>>>>>>>> + mas2_attr =3D mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>>> + >>>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>>>=20 >>>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>>>=20 >>>>>>>>>> What I understand from this function (someone can correct me= ) is >>>>>>>>>> that it >>>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>>> marked as RESERVED (for some reason). For us it does not matt= er >>>>>>>>> whether the page is reserved or not, if it is kernel visible = page then it >>>>>> is DDR. >>>>>>>>>>=20 >>>>>>>>>=20 >>>>>>>>> I think you are setting I|G by addressing all mmio pages, rig= ht? If >>>>>>>>> so, >>>>>>>>>=20 >>>>>>>>> KVM: direct mmio pfn check >>>>>>>>>=20 >>>>>>>>> Userspace may specify memory slots that are backed by mm= io >>>>>>>>> pages rather than >>>>>>>>> normal RAM. In some cases it is not enough to identify = these >>>>>>>>> mmio >>>>>>> pages >>>>>>>>> by pfn_valid(). This patch adds checking the PageReserv= ed as well. >>>>>>>>=20 >>>>>>>> Do you know what are those "some cases" and how checking >>>>>>>> PageReserved helps in >>>>>>> those cases? >>>>>>>=20 >>>>>>> No, myself didn't see these actual cases in qemu,too. But this = should >>>>>>> be chronically persistent as I understand ;-) >>>>>>=20 >>>>>> Then I will wait till someone educate me :) >>>>>=20 >>>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and= I do not want to call this for all tlbwe operation unless it is necess= ary. >>>>=20 >>>> It certainly does more than we need and potentially slows down the= fast path (RAM mapping). The only thing it does on top of "if (pfn_val= id())" is to check for pages that are declared reserved on the host. Th= is happens in 2 cases: >>>>=20 >>>> 1) Non cache coherent DMA >>>> 2) Memory hot remove >>>>=20 >>>> The non coherent DMA case would be interesting, as with the mechan= ism as it is in place in Linux today, we could potentially break normal= guest operation if we don't take it into account. However, it's Kconfi= g guarded by: >>>>=20 >>>> depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_C= OMMON >>>> default n if PPC_47x >>>> default y >>>>=20 >>>> so we never hit it with any core we care about ;). >>>>=20 >>>> Memory hot remove does not exist on e500 FWIW, so we don't have to= worry about that one either. >>>=20 >>> Thanks for this good information :) >>>=20 >>> So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_= is_mmio_pfn() to make sure that check is only valid when that is really= needed? This can decrease those unnecessary performance loss. >>>=20 >>> If I'm wrong please correct me :) >>=20 >> You're perfectly right, but this is generic KVM code. So it gets run= across all architectures. What if someone has the great idea to add a = new case here for x86, but doesn't tell us? In that case we potentially= break x86. >>=20 >> I'd rather not like to break x86 :). >>=20 >> However, it'd be very interesting to see a benchmark with this chang= e. Do you think you could just rip out the whole reserved check and run= a few benchmarks and show us the results? >>=20 >=20 > Often what case should be adopted to validate this scenario? Something which hammers the TLB emulation heavily. I usually just run /= bin/echo a thousand times in "time" and see how long it takes ;) Alex From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Date: Thu, 18 Jul 2013 10:27:49 +0000 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Message-Id: List-Id: References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7BE90.60300@windriver.com> <50683BE9-8BDD-4730-B866-F972BCF1EDD6@suse.de> <51E7C14C.6060600@windriver.com> In-Reply-To: <51E7C14C.6060600@windriver.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit To: =?windows-1252?Q?=22=93tiejun=2Echen=94=22?= Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 On 18.07.2013, at 12:19, “tiejun.chen” wrote: > On 07/18/2013 06:12 PM, Alexander Graf wrote: >> >> On 18.07.2013, at 12:08, “tiejun.chen” wrote: >> >>> On 07/18/2013 05:48 PM, Alexander Graf wrote: >>>> >>>> On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: >>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Bhushan Bharat-R65777 >>>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>>> To: '"�tiejun.chen�"' >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; Wood Scott- >>>>>> B07421 >>>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel >>>>>> managed pages >>>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: "�tiejun.chen�" [mailto:tiejun.chen@windriver.com] >>>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>>> To: Bhushan Bharat-R65777 >>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; Wood >>>>>>> Scott- >>>>>>> B07421 >>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for >>>>>>> kernel managed pages >>>>>>> >>>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "�tiejun.chen�" >>>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; >>>>>>>>> Wood >>>>>>>>> Scott- >>>>>>>>> B07421 >>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for >>>>>>>>> kernel managed pages >>>>>>>>> >>>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: "�tiejun.chen�" [mailto:tiejun.chen@windriver.com] >>>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; >>>>>>>>>>> Wood >>>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only >>>>>>>>>>> for kernel managed pages >>>>>>>>>>> >>>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>>> If there is a struct page for the requested mapping then it's >>>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheable) >>>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>>> inhibited, >>>>>>>>>>>> guarded) >>>>>>>>>>>> >>>>>>>>>>>> This helps setting proper TLB mapping for direct assigned device >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>>> --- >>>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++----- >>>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>>> usermode) >>>>>>>>>>>> return mas3; >>>>>>>>>>>> } >>>>>>>>>>>> >>>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>>> usermode) >>>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) >>>>>>>>>>>> { >>>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>>> + >>>>>>>>>>>> + mas2_attr = mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>>> + >>>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>>> >>>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>>> >>>>>>>>>> What I understand from this function (someone can correct me) is >>>>>>>>>> that it >>>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>>> marked as RESERVED (for some reason). For us it does not matter >>>>>>>>> whether the page is reserved or not, if it is kernel visible page then it >>>>>> is DDR. >>>>>>>>>> >>>>>>>>> >>>>>>>>> I think you are setting I|G by addressing all mmio pages, right? If >>>>>>>>> so, >>>>>>>>> >>>>>>>>> KVM: direct mmio pfn check >>>>>>>>> >>>>>>>>> Userspace may specify memory slots that are backed by mmio >>>>>>>>> pages rather than >>>>>>>>> normal RAM. In some cases it is not enough to identify these >>>>>>>>> mmio >>>>>>> pages >>>>>>>>> by pfn_valid(). This patch adds checking the PageReserved as well. >>>>>>>> >>>>>>>> Do you know what are those "some cases" and how checking >>>>>>>> PageReserved helps in >>>>>>> those cases? >>>>>>> >>>>>>> No, myself didn't see these actual cases in qemu,too. But this should >>>>>>> be chronically persistent as I understand ;-) >>>>>> >>>>>> Then I will wait till someone educate me :) >>>>> >>>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. >>>> >>>> It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of "if (pfn_valid())" is to check for pages that are declared reserved on the host. This happens in 2 cases: >>>> >>>> 1) Non cache coherent DMA >>>> 2) Memory hot remove >>>> >>>> The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: >>>> >>>> depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON >>>> default n if PPC_47x >>>> default y >>>> >>>> so we never hit it with any core we care about ;). >>>> >>>> Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. >>> >>> Thanks for this good information :) >>> >>> So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. >>> >>> If I'm wrong please correct me :) >> >> You're perfectly right, but this is generic KVM code. So it gets run across all architectures. What if someone has the great idea to add a new case here for x86, but doesn't tell us? In that case we potentially break x86. >> >> I'd rather not like to break x86 :). >> >> However, it'd be very interesting to see a benchmark with this change. Do you think you could just rip out the whole reserved check and run a few benchmarks and show us the results? >> > > Often what case should be adopted to validate this scenario? Something which hammers the TLB emulation heavily. I usually just run /bin/echo a thousand times in "time" and see how long it takes ;) Alex