From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Date: Thu, 18 Jul 2013 12:00:42 +0200 Message-ID: References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7AD7F.5080004@windriver.com> <42B0AAD2-F16A-4931-9B84-284260E4B436@suse.de> <51E7BBCA.6020908@windriver.com> Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 To: =?windows-1252?Q?=93tiejun=2Echen=94?= Return-path: In-Reply-To: <51E7BBCA.6020908@windriver.com> Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 18.07.2013, at 11:56, =E2=80=9Ctiejun.chen=E2=80=9D wrote: > On 07/18/2013 05:44 PM, Alexander Graf wrote: >>=20 >> On 18.07.2013, at 10:55, =EF=BF=BDtiejun.chen=EF=BF=BD wrote: >>=20 >>> On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: >>>>=20 >>>>=20 >>>>> -----Original Message----- >>>>> From: Bhushan Bharat-R65777 >>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>> To: '"=EF=BF=BDtiejun.chen=EF=BF=BD"' >>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; = Wood Scott- >>>>> B07421 >>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only f= or kernel >>>>> managed pages >>>>>=20 >>>>>=20 >>>>>=20 >>>>>> -----Original Message----- >>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@windri= ver.com] >>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>> To: Bhushan Bharat-R65777 >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de;= Wood >>>>>> Scott- >>>>>> B07421 >>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only = for >>>>>> kernel managed pages >>>>>>=20 >>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>>=20 >>>>>>>=20 >>>>>>>> -----Original Message----- >>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "=EF=BF=BD= tiejun.chen=EF=BF=BD" >>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.d= e; >>>>>>>> Wood >>>>>>>> Scott- >>>>>>>> B07421 >>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency onl= y for >>>>>>>> kernel managed pages >>>>>>>>=20 >>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>>=20 >>>>>>>>>=20 >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: "=EF=BF=BDtiejun.chen=EF=BF=BD" [mailto:tiejun.chen@wi= ndriver.com] >>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse= =2Ede; >>>>>>>>>> Wood >>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency o= nly >>>>>>>>>> for kernel managed pages >>>>>>>>>>=20 >>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>> If there is a struct page for the requested mapping then it= 's >>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheabl= e) >>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>> inhibited, >>>>>>>>>>> guarded) >>>>>>>>>>>=20 >>>>>>>>>>> This helps setting proper TLB mapping for direct assigned d= evice >>>>>>>>>>>=20 >>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>> --- >>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++---= -- >>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>>=20 >>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>> usermode) >>>>>>>>>>> return mas3; >>>>>>>>>>> } >>>>>>>>>>>=20 >>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>> usermode) >>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t = pfn) >>>>>>>>>>> { >>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>> + >>>>>>>>>>> + mas2_attr =3D mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>> + >>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>>=20 >>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>>=20 >>>>>>>>> What I understand from this function (someone can correct me)= is >>>>>>>>> that it >>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>> marked as RESERVED (for some reason). For us it does not matte= r >>>>>>>> whether the page is reserved or not, if it is kernel visible p= age then it >>>>> is DDR. >>>>>>>>>=20 >>>>>>>>=20 >>>>>>>> I think you are setting I|G by addressing all mmio pages, righ= t? If >>>>>>>> so, >>>>>>>>=20 >>>>>>>> KVM: direct mmio pfn check >>>>>>>>=20 >>>>>>>> Userspace may specify memory slots that are backed by mm= io >>>>>>>> pages rather than >>>>>>>> normal RAM. In some cases it is not enough to identify = these >>>>>>>> mmio >>>>>> pages >>>>>>>> by pfn_valid(). This patch adds checking the PageReserv= ed as well. >>>>>>>=20 >>>>>>> Do you know what are those "some cases" and how checking >>>>>>> PageReserved helps in >>>>>> those cases? >>>>>>=20 >>>>>> No, myself didn't see these actual cases in qemu,too. But this s= hould >>>>>> be chronically persistent as I understand ;-) >>>>>=20 >>>>> Then I will wait till someone educate me :) >>>>=20 >>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and = I do not want to call this for all tlbwe operation unless it is necessa= ry. >>>=20 >>> Furthermore, how to distinguish we're creating TLB entry for the de= vice assigned directly to the GS? >>=20 >> Because other devices wouldn't be available to the guest through mem= ory slots. >=20 > Yes. >=20 >>=20 >>> I think its unnecessary to always check if that is mmio's pfn since= we have more non direct assigned devices. >>=20 >> I'm not sure I understand. The shadow TLB code only knows "here is a= host virtual address". It needs to figure out whether the host physica= l address behind that is RAM (can access with cache enabled) or not (ha= s to disable cache) >>=20 >=20 > Sorry, looks I'm misleading you :-P >=20 >>> So maybe we can introduce another helper to fixup that TLB entry in= instead of this path. >>=20 >> This path does fix up the shadow (host) TLB entry :). >>=20 >=20 > I just mean whether we can have a separate path dedicated to those di= rect assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain l= evel of trust. In the current flow we don't trust anyone. We just check= every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory Slot 2) Guest TLB Flags If we take it from the memory slot we would have to trust QEMU (or any = other user space) to give us the right hints. Malicious user space coul= d set invalid flags. Also we'd have to add logic to track this - which = doesn't exist today. If we take it from the guest we have to trust the guest. Malicious gues= ts could set invalid flags. Now why is setting invalid flags a problem? If I understand Scott corre= ctly, it can break the host if you access certain host devices with cac= hing enabled. But to be sure I'd say we ask him directly :). Either way, not trusting anyone is definitely the safer choice. Alex From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Date: Thu, 18 Jul 2013 10:00:42 +0000 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages Message-Id: List-Id: References: <1374127456-9614-1-git-send-email-Bharat.Bhushan@freescale.com> <1374127456-9614-2-git-send-email-Bharat.Bhushan@freescale.com> <51E78A88.4000904@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6AFF@039-SN2MPN1-013.039d.mgd.msft.net> <51E799D0.1030406@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6DAB@039-SN2MPN1-013.039d.mgd.msft.net> <51E7A5A2.5040107@windriver.com> <6A3DF150A5B70D4F9B66A25E3F7C888D070D6E79@039-SN2MPN1-013.039d.mgd.msft.net> <51E7AD7F.5080004@windriver.com> <42B0AAD2-F16A-4931-9B84-284260E4B436@suse.de> <51E7BBCA.6020908@windriver.com> In-Reply-To: <51E7BBCA.6020908@windriver.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit To: =?windows-1252?Q?=93tiejun=2Echen=94?= Cc: Bhushan Bharat-R65777 , "kvm-ppc@vger.kernel.org" , "kvm@vger.kernel.org" , Wood Scott-B07421 On 18.07.2013, at 11:56, “tiejun.chen” wrote: > On 07/18/2013 05:44 PM, Alexander Graf wrote: >> >> On 18.07.2013, at 10:55, �tiejun.chen� wrote: >> >>> On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Bhushan Bharat-R65777 >>>>> Sent: Thursday, July 18, 2013 1:53 PM >>>>> To: '"�tiejun.chen�"' >>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; Wood Scott- >>>>> B07421 >>>>> Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel >>>>> managed pages >>>>> >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: "�tiejun.chen�" [mailto:tiejun.chen@windriver.com] >>>>>> Sent: Thursday, July 18, 2013 1:52 PM >>>>>> To: Bhushan Bharat-R65777 >>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; Wood >>>>>> Scott- >>>>>> B07421 >>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for >>>>>> kernel managed pages >>>>>> >>>>>> On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: kvm-ppc-owner@vger.kernel.org >>>>>>>> [mailto:kvm-ppc-owner@vger.kernel.org] On Behalf Of "�tiejun.chen�" >>>>>>>> Sent: Thursday, July 18, 2013 1:01 PM >>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; >>>>>>>> Wood >>>>>>>> Scott- >>>>>>>> B07421 >>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for >>>>>>>> kernel managed pages >>>>>>>> >>>>>>>> On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: "�tiejun.chen�" [mailto:tiejun.chen@windriver.com] >>>>>>>>>> Sent: Thursday, July 18, 2013 11:56 AM >>>>>>>>>> To: Bhushan Bharat-R65777 >>>>>>>>>> Cc: kvm-ppc@vger.kernel.org; kvm@vger.kernel.org; agraf@suse.de; >>>>>>>>>> Wood >>>>>>>>>> Scott- B07421; Bhushan Bharat-R65777 >>>>>>>>>> Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only >>>>>>>>>> for kernel managed pages >>>>>>>>>> >>>>>>>>>> On 07/18/2013 02:04 PM, Bharat Bhushan wrote: >>>>>>>>>>> If there is a struct page for the requested mapping then it's >>>>>>>>>>> normal DDR and the mapping sets "M" bit (coherent, cacheable) >>>>>>>>>>> else this is treated as I/O and we set "I + G" (cache >>>>>>>>>>> inhibited, >>>>>>>>>>> guarded) >>>>>>>>>>> >>>>>>>>>>> This helps setting proper TLB mapping for direct assigned device >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Bharat Bhushan >>>>>>>>>>> --- >>>>>>>>>>> arch/powerpc/kvm/e500_mmu_host.c | 17 ++++++++++++----- >>>>>>>>>>> 1 files changed, 12 insertions(+), 5 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> index 1c6a9d7..089c227 100644 >>>>>>>>>>> --- a/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> +++ b/arch/powerpc/kvm/e500_mmu_host.c >>>>>>>>>>> @@ -64,13 +64,20 @@ static inline u32 >>>>>>>>>>> e500_shadow_mas3_attrib(u32 mas3, int >>>>>>>>>> usermode) >>>>>>>>>>> return mas3; >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int >>>>>>>>>>> usermode) >>>>>>>>>>> +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) >>>>>>>>>>> { >>>>>>>>>>> + u32 mas2_attr; >>>>>>>>>>> + >>>>>>>>>>> + mas2_attr = mas2 & MAS2_ATTRIB_MASK; >>>>>>>>>>> + >>>>>>>>>>> + if (!pfn_valid(pfn)) { >>>>>>>>>> >>>>>>>>>> Why not directly use kvm_is_mmio_pfn()? >>>>>>>>> >>>>>>>>> What I understand from this function (someone can correct me) is >>>>>>>>> that it >>>>>>>> returns "false" when the page is managed by kernel and is not >>>>>>>> marked as RESERVED (for some reason). For us it does not matter >>>>>>>> whether the page is reserved or not, if it is kernel visible page then it >>>>> is DDR. >>>>>>>>> >>>>>>>> >>>>>>>> I think you are setting I|G by addressing all mmio pages, right? If >>>>>>>> so, >>>>>>>> >>>>>>>> KVM: direct mmio pfn check >>>>>>>> >>>>>>>> Userspace may specify memory slots that are backed by mmio >>>>>>>> pages rather than >>>>>>>> normal RAM. In some cases it is not enough to identify these >>>>>>>> mmio >>>>>> pages >>>>>>>> by pfn_valid(). This patch adds checking the PageReserved as well. >>>>>>> >>>>>>> Do you know what are those "some cases" and how checking >>>>>>> PageReserved helps in >>>>>> those cases? >>>>>> >>>>>> No, myself didn't see these actual cases in qemu,too. But this should >>>>>> be chronically persistent as I understand ;-) >>>>> >>>>> Then I will wait till someone educate me :) >>>> >>>> The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. >>> >>> Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? >> >> Because other devices wouldn't be available to the guest through memory slots. > > Yes. > >> >>> I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. >> >> I'm not sure I understand. The shadow TLB code only knows "here is a host virtual address". It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) >> > > Sorry, looks I'm misleading you :-P > >>> So maybe we can introduce another helper to fixup that TLB entry in instead of this path. >> >> This path does fix up the shadow (host) TLB entry :). >> > > I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain level of trust. In the current flow we don't trust anyone. We just check every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory Slot 2) Guest TLB Flags If we take it from the memory slot we would have to trust QEMU (or any other user space) to give us the right hints. Malicious user space could set invalid flags. Also we'd have to add logic to track this - which doesn't exist today. If we take it from the guest we have to trust the guest. Malicious guests could set invalid flags. Now why is setting invalid flags a problem? If I understand Scott correctly, it can break the host if you access certain host devices with caching enabled. But to be sure I'd say we ask him directly :). Either way, not trusting anyone is definitely the safer choice. Alex