From: Julien Grall <julien@xen.org>
To: Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [PATCH v2 01/12] x86: introduce ioremap_wc()
Date: Thu, 27 May 2021 13:48:58 +0100
Message-ID: <b8035805-4f44-18ce-f4cb-4ce1d3c594fc@xen.org>
In-Reply-To: <20abac99-609c-f4f6-1242-c79919f4c317@suse.com>

Hi Jan,

On 27/05/2021 13:30, Jan Beulich wrote:
> In order for a to-be-introduced ERMS form of memcpy() to not regress
> boot performance on certain systems when video output is active, we
> first need to arrange for avoiding further dependency on firmware
> setting up MTRRs in a way we can actually further modify. On many
> systems, due to the continuously growing amounts of installed memory,
> MTRRs get configured with at least one huge WB range, and with MMIO
> ranges below 4Gb then forced to UC via overlapping MTRRs. mtrr_add(), as
> it is today, can't deal with such a setup. Hence on such systems we
> presently leave the frame buffer mapped UC, leading to significantly
> reduced performance when using REP STOSB / REP MOVSB.
> 
> On post-PentiumII hardware (i.e. any that's capable of running 64-bit
> code), an effective memory type of WC can be achieved without MTRRs, by
> simply referencing the respective PAT entry from the PTEs. While this
> will leave the switch to ERMS forms of memset() and memcpy() with
> largely unchanged performance, the change here on its own improves
> performance on affected systems quite significantly: Measuring just the
> individual affected memcpy() invocations yielded a speedup by a factor
> of over 250 on my initial (Skylake) test system. memset() isn't getting
> improved by as much there, but still by a factor of about 20.
> 
> While adding {__,}PAGE_HYPERVISOR_WC, also add {__,}PAGE_HYPERVISOR_WT
> to, at the very least, make clear what PTE flags this memory type uses.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Mark ioremap_wc() __init.
> ---
> TBD: If the VGA range is WC in the fixed range MTRRs, reusing the low
>       1st Mb mapping (like ioremap() does) would be an option.
> 
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5881,6 +5881,20 @@ void __iomem *ioremap(paddr_t pa, size_t
>       return (void __force __iomem *)va;
>   }
>   
> +void __iomem *__init ioremap_wc(paddr_t pa, size_t len)
> +{
> +    mfn_t mfn = _mfn(PFN_DOWN(pa));
> +    unsigned int offs = pa & (PAGE_SIZE - 1);
> +    unsigned int nr = PFN_UP(offs + len);
> +    void *va;
> +
> +    WARN_ON(page_is_ram_type(mfn_x(mfn), RAM_TYPE_CONVENTIONAL));
> +
> +    va = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR_WC, VMAP_DEFAULT);
> +
> +    return (void __force __iomem *)(va + offs);
> +}
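
For readers following the arithmetic in the quoted hunk, here is a minimal standalone sketch of how the first frame number, the in-page offset, and the page count are derived from (pa, len). It assumes 4 KiB pages and redefines PFN_DOWN()/PFN_UP() locally for user space; struct map_span and span_for() are hypothetical names used only for illustration and are not part of Xen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: 4 KiB pages. The macro names mirror those used in the
 * patch, but they are redefined here so the sketch builds outside Xen.
 */
#define PAGE_SHIFT  12
#define PAGE_SIZE   ((uint64_t)1 << PAGE_SHIFT)
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)
#define PFN_UP(x)   (((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

/* Hypothetical helper type: the three values the quoted function computes. */
struct map_span {
    uint64_t first_pfn;  /* frame containing the first byte of the range */
    unsigned int offs;   /* offset of pa within that frame */
    unsigned int nr;     /* number of frames needed to cover the range */
};

static struct map_span span_for(uint64_t pa, size_t len)
{
    struct map_span s;

    /* Sketch-only precondition; the quoted patch has no such check. */
    assert(len > 0);

    s.first_pfn = PFN_DOWN(pa);
    s.offs = (unsigned int)(pa & (PAGE_SIZE - 1));
    /* The offset is added before rounding up, so an unaligned start that
     * spills into the following page is counted. */
    s.nr = (unsigned int)PFN_UP(s.offs + len);

    return s;
}
```

With these definitions, span_for(0xfd000800, 0x1000) covers two pages starting at frame 0xfd000 with an offset of 0x800, whereas the page-aligned span_for(0xfd000000, 0x1000) covers exactly one. The offset is then added back to the mapped virtual address, which is what the final "va + offs" in the hunk does.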

Arm already provides ioremap_wc(), which is a wrapper around
ioremap_attr(). Can this be moved to common code to avoid duplication?

Cheers,

-- 
Julien Grall


Thread overview: 32+ messages
2021-05-27 12:29 [PATCH v2 00/12] x86: memcpy() / memset() (non-)ERMS flavors plus fallout Jan Beulich
2021-05-27 12:30 ` [PATCH v2 01/12] x86: introduce ioremap_wc() Jan Beulich
2021-05-27 12:48   ` Julien Grall [this message]
2021-05-27 13:09     ` Jan Beulich
2021-05-27 13:30       ` Julien Grall
2021-05-27 14:57         ` Jan Beulich
2021-05-27 12:31 ` [PATCH v2 02/12] x86: re-work memset() Jan Beulich
2021-05-27 12:31 ` [PATCH v2 03/12] x86: re-work memcpy() Jan Beulich
2021-05-27 12:31 ` [PATCH v2 04/12] x86: control memset() and memcpy() inlining Jan Beulich
2021-05-27 12:32 ` [PATCH v2 05/12] x86: introduce "hot" and "cold" page clearing functions Jan Beulich
2021-05-27 12:32 ` [PATCH v2 06/12] page-alloc: make scrub_on_page() static Jan Beulich
2021-05-27 12:33 ` [PATCH v2 07/12] mm: allow page scrubbing routine(s) to be arch controlled Jan Beulich
2021-05-27 13:06   ` Julien Grall
2021-05-27 13:58     ` Jan Beulich
2021-06-03  9:39       ` Julien Grall
2021-06-04 13:23         ` Jan Beulich
2021-06-07 18:12           ` Julien Grall
2021-05-27 12:34 ` [PATCH v2 08/12] x86: move .text.kexec Jan Beulich
2022-02-18 13:34   ` Andrew Cooper
2021-05-27 12:34 ` [PATCH v2 09/12] video/vesa: unmap frame buffer when relinquishing console Jan Beulich
2022-02-18 13:36   ` Andrew Cooper
2021-05-27 12:35 ` [PATCH v2 10/12] video/vesa: drop "vesa-mtrr" command line option Jan Beulich
2021-05-27 12:35 ` [PATCH v2 11/12] video/vesa: drop "vesa-remap" " Jan Beulich
2022-02-18 13:35   ` Andrew Cooper
2021-05-27 12:36 ` [PATCH v2 12/12] video/vesa: adjust (not just) command line option handling Jan Beulich
2022-02-17 11:01 ` [PATCH RESEND v2] x86: introduce ioremap_wc() Jan Beulich
2022-02-17 14:47   ` Roger Pau Monné
2022-02-17 15:02     ` Jan Beulich
2022-02-17 15:50       ` Roger Pau Monné
2022-02-17 15:57         ` Jan Beulich
2022-02-18  9:09           ` Roger Pau Monné
2022-02-18  9:23             ` Jan Beulich
