All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Beulich <jbeulich@suse.com>
To: Julien Grall <julien@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"George Dunlap" <george.dunlap@citrix.com>,
	"Ian Jackson" <iwj@xenproject.org>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Wei Liu" <wl@xen.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH v2 07/12] mm: allow page scrubbing routine(s) to be arch controlled
Date: Thu, 27 May 2021 15:58:35 +0200	[thread overview]
Message-ID: <e805e525-f024-8b5f-3814-0c1346a227f8@suse.com> (raw)
In-Reply-To: <b1c10ad9-2cef-031d-39c2-8d2013b3e0b5@xen.org>

On 27.05.2021 15:06, Julien Grall wrote:
> On 27/05/2021 13:33, Jan Beulich wrote:
>> Especially when dealing with large amounts of memory, memset() may not
>> be very efficient; this can be bad enough that even for debug builds a
>> custom function is warranted. We additionally want to distinguish "hot"
>> and "cold" cases.
> 
> Do you have any benchmark showing the performance improvement?

This is based on the numbers provided at
https://lists.xen.org/archives/html/xen-devel/2021-04/msg00716.html (???)
with the thread with some of the prior discussion rooted at
https://lists.xen.org/archives/html/xen-devel/2021-04/msg00425.html

I'm afraid I lack ideas on how to sensibly measure _all_ of the
effects (i.e. including the amount of disturbing of caches).

>> ---
>> The choice between hot and cold in scrub_one_page()'s callers is
>> certainly up for discussion / improvement.
> 
> To get the discussion started, can you explain how you made the decision 
> between hot/cot? This will also want to be written down in the commit 
> message.

Well, the initial trivial heuristic is "allocation for oneself" vs
"allocation for someone else, or freeing, or scrubbing", i.e. whether
it would be likely that the page will soon be accessed again (or for
the first time).

>> --- /dev/null
>> +++ b/xen/arch/x86/scrub_page.S
>> @@ -0,0 +1,41 @@
>> +        .file __FILE__
>> +
>> +#include <asm/asm_defns.h>
>> +#include <xen/page-size.h>
>> +#include <xen/scrub.h>
>> +
>> +ENTRY(scrub_page_cold)
>> +        mov     $PAGE_SIZE/32, %ecx
>> +        mov     $SCRUB_PATTERN, %rax
>> +
>> +0:      movnti  %rax,   (%rdi)
>> +        movnti  %rax,  8(%rdi)
>> +        movnti  %rax, 16(%rdi)
>> +        movnti  %rax, 24(%rdi)
>> +        add     $32, %rdi
>> +        sub     $1, %ecx
>> +        jnz     0b
>> +
>> +        sfence
>> +        ret
>> +        .type scrub_page_cold, @function
>> +        .size scrub_page_cold, . - scrub_page_cold
>> +
>> +        .macro scrub_page_stosb
>> +        mov     $PAGE_SIZE, %ecx
>> +        mov     $SCRUB_BYTE_PATTERN, %eax
>> +        rep stosb
>> +        ret
>> +        .endm
>> +
>> +        .macro scrub_page_stosq
>> +        mov     $PAGE_SIZE/8, %ecx
>> +        mov     $SCRUB_PATTERN, %rax
>> +        rep stosq
>> +        ret
>> +        .endm
>> +
>> +ENTRY(scrub_page_hot)
>> +        ALTERNATIVE scrub_page_stosq, scrub_page_stosb, X86_FEATURE_ERMS
>> +        .type scrub_page_hot, @function
>> +        .size scrub_page_hot, . - scrub_page_hot
> 
>  From the commit message, it is not clear how the implementation for 
> hot/cold was chosen. Can you outline in the commit message what are the 
> assumption for each helper?

I've added 'The goal is for accesses of "cold" pages to not
disturb caches (albeit finding a good balance between this
and the higher latency looks to be difficult).'

>> @@ -1046,12 +1051,14 @@ static struct page_info *alloc_heap_page
>>       if ( first_dirty != INVALID_DIRTY_IDX ||
>>            (scrub_debug && !(memflags & MEMF_no_scrub)) )
>>       {
>> +        bool cold = d && d != current->domain;
> 
> So the assumption is if the domain is not running, then the content is 
> not in the cache. Is that correct?

Not exactly: For one, instead of "not running" it is "is not the current
domain", i.e. there may still be vCPU-s of the domain running elsewhere.
And for the cache the question isn't so much of "is in cache", but to
avoid needlessly bringing contents into the cache when the data is
unlikely to be used again soon.

Jan


  reply	other threads:[~2021-05-27 13:58 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-27 12:29 [PATCH v2 00/12] x86: memcpy() / memset() (non-)ERMS flavors plus fallout Jan Beulich
2021-05-27 12:30 ` [PATCH v2 01/12] x86: introduce ioremap_wc() Jan Beulich
2021-05-27 12:48   ` Julien Grall
2021-05-27 13:09     ` Jan Beulich
2021-05-27 13:30       ` Julien Grall
2021-05-27 14:57         ` Jan Beulich
2021-05-27 12:31 ` [PATCH v2 02/12] x86: re-work memset() Jan Beulich
2021-05-27 12:31 ` [PATCH v2 03/12] x86: re-work memcpy() Jan Beulich
2021-05-27 12:31 ` [PATCH v2 04/12] x86: control memset() and memcpy() inlining Jan Beulich
2021-05-27 12:32 ` [PATCH v2 05/12] x86: introduce "hot" and "cold" page clearing functions Jan Beulich
2021-05-27 12:32 ` [PATCH v2 06/12] page-alloc: make scrub_on_page() static Jan Beulich
2021-05-27 12:33 ` [PATCH v2 07/12] mm: allow page scrubbing routine(s) to be arch controlled Jan Beulich
2021-05-27 13:06   ` Julien Grall
2021-05-27 13:58     ` Jan Beulich [this message]
2021-06-03  9:39       ` Julien Grall
2021-06-04 13:23         ` Jan Beulich
2021-06-07 18:12           ` Julien Grall
2021-05-27 12:34 ` [PATCH v2 08/12] x86: move .text.kexec Jan Beulich
2022-02-18 13:34   ` Andrew Cooper
2021-05-27 12:34 ` [PATCH v2 09/12] video/vesa: unmap frame buffer when relinquishing console Jan Beulich
2022-02-18 13:36   ` Andrew Cooper
2021-05-27 12:35 ` [PATCH v2 10/12] video/vesa: drop "vesa-mtrr" command line option Jan Beulich
2021-05-27 12:35 ` [PATCH v2 11/12] video/vesa: drop "vesa-remap" " Jan Beulich
2022-02-18 13:35   ` Andrew Cooper
2021-05-27 12:36 ` [PATCH v2 12/12] video/vesa: adjust (not just) command line option handling Jan Beulich
2022-02-17 11:01 ` [PATCH RESEND v2] x86: introduce ioremap_wc() Jan Beulich
2022-02-17 14:47   ` Roger Pau Monné
2022-02-17 15:02     ` Jan Beulich
2022-02-17 15:50       ` Roger Pau Monné
2022-02-17 15:57         ` Jan Beulich
2022-02-18  9:09           ` Roger Pau Monné
2022-02-18  9:23             ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e805e525-f024-8b5f-3814-0c1346a227f8@suse.com \
    --to=jbeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=iwj@xenproject.org \
    --cc=julien@xen.org \
    --cc=roger.pau@citrix.com \
    --cc=sstabellini@kernel.org \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.