All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Wei Liu" <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>
Subject: [PATCH v2 04/12] x86: control memset() and memcpy() inlining
Date: Thu, 27 May 2021 14:31:44 +0200	[thread overview]
Message-ID: <654762fa-a7e2-3121-21d5-b992e8428d0e@suse.com> (raw)
In-Reply-To: <8f56a8f4-0482-932f-96a9-c791bebb4610@suse.com>

Stop the compiler from inlining non-trivial memset() and memcpy() (for
memset() see e.g. map_vcpu_info() or kimage_load_segments() for
examples). This way we even keep the compiler from using REP STOSQ /
REP MOVSQ when we'd prefer REP STOSB / REP MOVSB (when ERMS is
available).

With gcc10 this yields a modest .text size reduction (release build) of
around 2k.

Unfortunately these options aren't understood by the clang versions I
have readily available for testing with; I'm unaware of equivalents.

Note also that using cc-option-add is not an option here, or at least I
couldn't make things work with it (in case the option was not supported
by the compiler): The embedded comma in the option looks to be getting
in the way.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
---
The boundary values are of course up for discussion - I wasn't really
certain whether to use 16 or 32; I'd be less certain about using yet
larger values.

Similarly whether to permit the compiler to emit REP STOSQ / REP MOVSQ
for known size, properly aligned blocks is up for discussion.

--- a/xen/arch/x86/arch.mk
+++ b/xen/arch/x86/arch.mk
@@ -51,6 +51,9 @@ CFLAGS-$(CONFIG_INDIRECT_THUNK) += -fno-
 $(call cc-option-add,CFLAGS-stack-boundary,CC,-mpreferred-stack-boundary=3)
 export CFLAGS-stack-boundary
 
+CFLAGS += $(call cc-option,$(CC),-mmemcpy-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+CFLAGS += $(call cc-option,$(CC),-mmemset-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+
 ifeq ($(CONFIG_UBSAN),y)
 # Don't enable alignment sanitisation.  x86 has efficient unaligned accesses,
 # and various things (ACPI tables, hypercall pages, stubs, etc) are wont-fix.



  parent reply	other threads:[~2021-05-27 12:31 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-27 12:29 [PATCH v2 00/12] x86: memcpy() / memset() (non-)ERMS flavors plus fallout Jan Beulich
2021-05-27 12:30 ` [PATCH v2 01/12] x86: introduce ioremap_wc() Jan Beulich
2021-05-27 12:48   ` Julien Grall
2021-05-27 13:09     ` Jan Beulich
2021-05-27 13:30       ` Julien Grall
2021-05-27 14:57         ` Jan Beulich
2021-05-27 12:31 ` [PATCH v2 02/12] x86: re-work memset() Jan Beulich
2021-05-27 12:31 ` [PATCH v2 03/12] x86: re-work memcpy() Jan Beulich
2021-05-27 12:31 ` Jan Beulich [this message]
2021-05-27 12:32 ` [PATCH v2 05/12] x86: introduce "hot" and "cold" page clearing functions Jan Beulich
2021-05-27 12:32 ` [PATCH v2 06/12] page-alloc: make scrub_on_page() static Jan Beulich
2021-05-27 12:33 ` [PATCH v2 07/12] mm: allow page scrubbing routine(s) to be arch controlled Jan Beulich
2021-05-27 13:06   ` Julien Grall
2021-05-27 13:58     ` Jan Beulich
2021-06-03  9:39       ` Julien Grall
2021-06-04 13:23         ` Jan Beulich
2021-06-07 18:12           ` Julien Grall
2021-05-27 12:34 ` [PATCH v2 08/12] x86: move .text.kexec Jan Beulich
2022-02-18 13:34   ` Andrew Cooper
2021-05-27 12:34 ` [PATCH v2 09/12] video/vesa: unmap frame buffer when relinquishing console Jan Beulich
2022-02-18 13:36   ` Andrew Cooper
2021-05-27 12:35 ` [PATCH v2 10/12] video/vesa: drop "vesa-mtrr" command line option Jan Beulich
2021-05-27 12:35 ` [PATCH v2 11/12] video/vesa: drop "vesa-remap" " Jan Beulich
2022-02-18 13:35   ` Andrew Cooper
2021-05-27 12:36 ` [PATCH v2 12/12] video/vesa: adjust (not just) command line option handling Jan Beulich
2022-02-17 11:01 ` [PATCH RESEND v2] x86: introduce ioremap_wc() Jan Beulich
2022-02-17 14:47   ` Roger Pau Monné
2022-02-17 15:02     ` Jan Beulich
2022-02-17 15:50       ` Roger Pau Monné
2022-02-17 15:57         ` Jan Beulich
2022-02-18  9:09           ` Roger Pau Monné
2022-02-18  9:23             ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=654762fa-a7e2-3121-21d5-b992e8428d0e@suse.com \
    --to=jbeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=roger.pau@citrix.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.