From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tamas K Lengyel
Subject: Re: [for-4.8][PATCH v2 00/23] xen/arm: Rework the P2M code to follow break-before-make sequence
Date: Thu, 15 Sep 2016 11:23:03 -0600
Message-ID:
References: <1473938919-31976-1-git-send-email-julien.grall@arm.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5957175593764824501=="
Return-path:
In-Reply-To: <1473938919-31976-1-git-send-email-julien.grall@arm.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: xen-devel-bounces@lists.xen.org
Sender: "Xen-devel"
To: Julien Grall
Cc: "Edgar E. Iglesias", Stefano Stabellini, Razvan Cojocaru, Steve Capper, Sergej Proskurin, Xen-devel, Dirk Behme, Shanker Donthineni, wei.chen@linaro.org
List-Id: xen-devel@lists.xenproject.org

--===============5957175593764824501==
Content-Type: multipart/alternative; boundary=001a114675a82f7449053c8f1981

--001a114675a82f7449053c8f1981
Content-Type: text/plain; charset=UTF-8

On Thu, Sep 15, 2016 at 5:28 AM, Julien Grall <julien.grall@arm.com> wrote:
> Hello all,
>
> The ARM architecture mandates the use of a break-before-make sequence when
> changing translation entries if the page table is shared between multiple
> CPUs whenever a valid entry is replaced by another valid entry (see D4.7.1
> in ARM DDI 0487A.j for more details).
>
> The current P2M code does not respect this sequence and may result to
> break coherency on some processors.
>
> Adapting the current implementation to use break-before-make sequence would
> imply some code duplication and more TLBs invalidations than necessary.
> For instance, if we are replacing a 4KB page and the current mapping in
> the P2M is using a 1GB superpage, the following steps will happen:
>     1) Shatter the 1GB superpage into a series of 2MB superpages
>     2) Shatter the 2MB superpage into a series of 4KB superpages
>     3) Replace the 4KB page
>
> As the current implementation is shattering while descending and install
> the mapping before continuing to the next level, Xen would need to issue 3
> TLB invalidation instructions which is clearly inefficient.
>
> Furthermore, all the operations which modify the page table are using the
> same skeleton. It is more complicated to maintain different code paths than
> having a generic function that set an entry and take care of the
> break-before-make sequence.
>
> The new implementation is based on the x86 EPT one which, I think, fits
> quite well for the break-before-make sequence whilst keeping the code
> simple.
>
> For all the changes see in each patch.
>
> I have provided a branch based on upstream here:
> git://xenbits.xen.org/people/julieng/xen-unstable.git branch p2m-v2
>
>

Tested-by: Tamas K Lengyel <tamas@tklengyel.com>

Works without any issue on both the Cubietruck and the HiKey LeMaker with
the xen-access test-cases.

Cheers,
Tamas

--001a114675a82f7449053c8f1981
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable


--001a114675a82f7449053c8f1981--
--===============5957175593764824501==--