From: Jan Beulich <jbeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Xen-devel <xen-devel@lists.xenproject.org>,
	"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v2 4/5] x86/boot: Simplify pagetable manipulation loops
Date: Mon, 20 Jan 2020 11:46:01 +0100
Message-ID: <aa966174-1ee4-b720-30ad-b044ea703ea8@suse.com>
In-Reply-To: <20200117204223.30076-5-andrew.cooper3@citrix.com>

On 17.01.2020 21:42, Andrew Cooper wrote:
> For __page_tables_{start,end} and L3 bootmap initialisation, the logic is
> unnecessarily complicated owing to its attempt to use the LOOP instruction,
> which results in an off-by-8 memory address owing to LOOP's termination
> condition.
> 
> Rewrite both loops for improved clarity and speed.
> 
> Misc notes:
>  * TEST $IMM, MEM can't macrofuse.  The loop has 0x1200 iterations, so pull
>    the $_PAGE_PRESENT constant out into a spare register to turn the TEST into
>    its %REG, MEM form, which can macrofuse.
>  * Avoid the use of %fs-relative references.  %esi-relative is the more common
>    form in the code, and doesn't suffer an address generation overhead.
>  * Avoid LOOP.  CMP/JB isn't microcoded and is faster to execute in all cases.
>  * For a trivial 4-iteration loop, even compilers unroll these.  The
>    generated code size is a fraction larger, but this is init and the asm is
>    far easier to follow.
>  * Reposition the l2=>l1 bootmap construction so the asm reads in pagetable
>    level order.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with two remarks/questions, but leaving it up to you whether
you want to adjust the code:

> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -662,11 +662,17 @@ trampoline_setup:
>          mov     %edx,sym_fs(boot_tsc_stamp)+4
>  
>          /* Relocate pagetables to point at Xen's current location in memory. */
> -        mov     $((__page_tables_end-__page_tables_start)/8),%ecx
> -1:      testl   $_PAGE_PRESENT,sym_fs(__page_tables_start)-8(,%ecx,8)
> +        mov     $_PAGE_PRESENT, %edx
> +        lea     sym_esi(__page_tables_start), %eax
> +        lea     sym_esi(__page_tables_end), %edi
> +
> +1:      testb   %dl, (%eax)  /* if page present */

With an immediate operand, using TESTB is generally helpful because
TEST has no form taking a (sign- or otherwise) extended 8-bit
immediate, so the byte form is the only way to keep the immediate
short. With a register operand, I think it would generally be better
to use the native operand size, even if for register reads the
partial register access penalty may (today) be zero.

> @@ -701,22 +707,27 @@ trampoline_setup:
>          cmp     %edx, %ecx
>          jbe     1b
>  
> -        /* Initialize L3 boot-map page directory entries. */
> -        lea     __PAGE_HYPERVISOR+(L2_PAGETABLE_ENTRIES*8)*3+sym_esi(l2_bootmap),%eax
> -        mov     $4,%ecx
> -1:      mov     %eax,sym_fs(l3_bootmap)-8(,%ecx,8)
> -        sub     $(L2_PAGETABLE_ENTRIES*8),%eax
> -        loop    1b
> -
> -        /* Map the permanent trampoline page into l{1,2}_bootmap[]. */
> +        /* Map 4x l2_bootmap[] into l3_bootmap[0...3] */
> +        lea     __PAGE_HYPERVISOR + sym_esi(l2_bootmap), %eax
> +        mov     $PAGE_SIZE, %edx
> +        mov     %eax, 0  + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 8  + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 16 + sym_esi(l3_bootmap)
> +        add     %edx, %eax
> +        mov     %eax, 24 + sym_esi(l3_bootmap)

It took me a moment to realize the code is correct despite PAGE_SIZE
not being mentioned between the MOVs. As you don't view code size
as a (primary) concern, it is perhaps worth using

        add     $PAGE_SIZE, %eax

everywhere, all the more so since ADD with %eax as the destination
has a special, ModR/M-less immediate encoding?

Jan

