linux-coco.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
	Sean Christopherson <seanjc@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joerg Roedel <jroedel@suse.de>, Ard Biesheuvel <ardb@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>,
	Kuppuswamy Sathyanarayanan
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	David Rientjes <rientjes@google.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Ingo Molnar <mingo@redhat.com>,
	Dario Faggioli <dfaggioli@suse.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Mike Rapoport <rppt@kernel.org>,
	marcelo.cerri@canonical.com, tim.gardner@canonical.com,
	khalid.elmously@canonical.com, philip.cox@canonical.com,
	x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org,
	Mike Rapoport <rppt@linux.ibm.com>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: Re: [PATCHv7 02/14] mm: Add support for unaccepted memory
Date: Fri, 5 Aug 2022 16:22:45 +0200	[thread overview]
Message-ID: <e828b48f-dcd8-6404-fc30-6e1dd682252f@redhat.com> (raw)
In-Reply-To: <f936b024-43e1-5390-e33f-ad7d355a2802@suse.cz>

On 05.08.22 15:38, Vlastimil Babka wrote:
> On 8/5/22 14:09, David Hildenbrand wrote:
>> On 05.08.22 13:49, Vlastimil Babka wrote:
>>> On 6/14/22 14:02, Kirill A. Shutemov wrote:
>>>> UEFI Specification version 2.9 introduces the concept of memory
>>>> acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
>>>> SEV-SNP, require memory to be accepted before it can be used by the
>>>> guest. Accepting happens via a protocol specific to the Virtual Machine
>>>> platform.
>>>>
>>>> There are several ways kernel can deal with unaccepted memory:
>>>>
>>>>  1. Accept all the memory during the boot. It is easy to implement and
>>>>     it doesn't have runtime cost once the system is booted. The downside
>>>>     is very long boot time.
>>>>
>>>>     Accept can be parallelized to multiple CPUs to keep it manageable
>>>>     (i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
>>>>     memory bandwidth and does not scale beyond the point.
>>>>
>>>>  2. Accept a block of memory on the first use. It requires more
>>>>     infrastructure and changes in page allocator to make it work, but
>>>>     it provides good boot time.
>>>>
>>>>     On-demand memory accept means latency spikes every time kernel steps
>>>>     onto a new memory block. The spikes will go away once workload data
>>>>     set size gets stabilized or all memory gets accepted.
>>>>
>>>>  3. Accept all memory in background. Introduce a thread (or multiple)
>>>>     that gets memory accepted proactively. It will minimize time the
>>>>     system experience latency spikes on memory allocation while keeping
>>>>     low boot time.
>>>>
>>>>     This approach cannot function on its own. It is an extension of #2:
>>>>     background memory acceptance requires functional scheduler, but the
>>>>     page allocator may need to tap into unaccepted memory before that.
>>>>
>>>>     The downside of the approach is that these threads also steal CPU
>>>>     cycles and memory bandwidth from the user's workload and may hurt
>>>>     user experience.
>>>>
>>>> Implement #2 for now. It is a reasonable default. Some workloads may
>>>> want to use #1 or #3 and they can be implemented later based on user's
>>>> demands.
>>>>
>>>> Support of unaccepted memory requires a few changes in core-mm code:
>>>>
>>>>   - memblock has to accept memory on allocation;
>>>>
>>>>   - page allocator has to accept memory on the first allocation of the
>>>>     page;
>>>>
>>>> Memblock change is trivial.
>>>>
>>>> The page allocator is modified to accept pages on the first allocation.
>>>> The new page type (encoded in the _mapcount) -- PageUnaccepted() -- is
>>>> used to indicate that the page requires acceptance.
>>>>
>>>> Architecture has to provide two helpers if it wants to support
>>>> unaccepted memory:
>>>>
>>>>  - accept_memory() makes a range of physical addresses accepted.
>>>>
>>>>  - range_contains_unaccepted_memory() checks anything within the range
>>>>    of physical addresses requires acceptance.
>>>>
>>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>	# memblock
>>>> Reviewed-by: David Hildenbrand <david@redhat.com>
>>>
>>> Hmm I realize it's not ideal to raise this at v7, and maybe it was discussed
>>> before, but it's really not great how this affects the core page allocator
>>> paths. Wouldn't it be possible to only release pages to page allocator when
>>> accepted, and otherwise use some new per-zone variables together with the
>>> bitmap to track how much exactly is where to accept? Then it could be hooked
>>> in get_page_from_freelist() similarly to CONFIG_DEFERRED_STRUCT_PAGE_INIT -
>>> if we fail zone_watermark_fast() and there are unaccepted pages in the zone,
>>> accept them and continue. With a static key to flip in case we eventually
>>> accept everything. Because this is really similar scenario to the deferred
>>> init and that one was solved in a way that adds minimal overhead.
>>
>> I kind of like just having the memory stats being correct (e.g., free
>> memory) and acceptance being an internal detail to be triggered when
>> allocating pages -- just like the arch_alloc_page() callback.
> 
> Hm, good point about the stats. Could be tweaked perhaps so it appears
> correct on the outside, but might be tricky.
> 
>> I'm sure we could optimize for the !unaccepted memory via static keys
>> also in this version with some checks at the right places if we find
>> this to hurt performance?
> 
> It would be great if we would at least somehow hit the necessary code only
> when dealing with a >=pageblock size block. The bitmap approach and
> accepting everything smaller uprofront actually seems rather compatible. Yet
> in the current patch we e.g. check PageUnaccepted(buddy) on every buddy size
> while merging.
> 
> A list that sits besides the existing free_area, contains only >=pageblock
> order sizes of unaccepted pages (no migratetype distinguished) and we tap
> into it approximately before __rmqueue_fallback()? There would be some
> trickery around releasing zone-lock for doing accept_memory(), but should be
> manageable.
> 

Just curious, do we have a microbenchmark that is able to reveal the
impact of such code changes before we start worrying?

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2022-08-05 14:22 UTC|newest]

Thread overview: 139+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-14 12:02 [PATCHv7 00/14] mm, x86/cc: Implement support for unaccepted memory Kirill A. Shutemov
2022-06-14 12:02 ` [PATCHv7 01/14] x86/boot: Centralize __pa()/__va() definitions Kirill A. Shutemov
2022-06-23 17:37   ` Dave Hansen
2022-06-14 12:02 ` [PATCHv7 02/14] mm: Add support for unaccepted memory Kirill A. Shutemov
2022-06-14 12:57   ` Gupta, Pankaj
2022-06-17 19:28   ` Tom Lendacky
2022-06-17 20:53     ` Tom Lendacky
2022-07-21 15:14   ` Borislav Petkov
2022-07-21 15:49     ` Dave Hansen
2022-07-22 19:18       ` Borislav Petkov
2022-07-22 19:30         ` Dave Hansen
2022-07-25 12:23           ` Borislav Petkov
2022-07-25 12:38             ` David Hildenbrand
2022-07-25 12:53               ` Borislav Petkov
2022-07-26 14:30                 ` David Hildenbrand
2022-07-25 13:00             ` Mike Rapoport
2022-07-25 13:05               ` Borislav Petkov
2022-08-05 11:49   ` Vlastimil Babka
2022-08-05 12:09     ` David Hildenbrand
2022-08-05 13:38       ` Vlastimil Babka
2022-08-05 14:22         ` David Hildenbrand [this message]
2022-08-05 14:53           ` Dave Hansen
2022-08-05 14:41         ` Dave Hansen
2022-08-05 18:17           ` Vlastimil Babka
2022-08-08 15:55             ` Dave Hansen
2022-08-10 14:19     ` Mel Gorman
2022-08-15 21:08       ` Dionna Amalie Glaze
2022-08-15 22:02         ` Tom Lendacky
2022-08-29 16:02           ` Dionna Amalie Glaze
2022-08-29 16:19             ` Dave Hansen
2022-09-06 17:50               ` Dionna Amalie Glaze
2022-09-08 12:11                 ` Mike Rapoport
2022-09-08 16:23                   ` Dionna Amalie Glaze
2022-09-08 19:28                     ` Mike Rapoport
2022-09-22 14:31                       ` Tom Lendacky
2022-09-24  1:03                         ` Kirill A. Shutemov
2022-09-24  9:36                           ` Mike Rapoport
2022-09-26 12:10                           ` Kirill A. Shutemov
2022-09-26 13:38                             ` Tom Lendacky
2022-09-26 15:42                               ` Kirill A. Shutemov
2022-09-26 15:42                               ` Tom Lendacky
2022-06-14 12:02 ` [PATCHv7 03/14] mm: Report unaccepted memory in meminfo Kirill A. Shutemov
2022-07-26 14:33   ` David Hildenbrand
2022-06-14 12:02 ` [PATCHv7 04/14] efi/x86: Get full memory map in allocate_e820() Kirill A. Shutemov
2022-07-25 13:02   ` Borislav Petkov
2022-06-14 12:02 ` [PATCHv7 05/14] x86/boot: Add infrastructure required for unaccepted memory support Kirill A. Shutemov
2022-06-15 10:19   ` Peter Zijlstra
2022-06-15 15:05     ` Kirill A. Shutemov
2022-07-17 17:16       ` Borislav Petkov
2022-07-25 21:33   ` Borislav Petkov
2022-06-14 12:02 ` [PATCHv7 06/14] efi/x86: Implement support for unaccepted memory Kirill A. Shutemov
2022-06-22 19:58   ` Dave Hansen
2022-07-26  8:35   ` Borislav Petkov
2022-06-14 12:02 ` [PATCHv7 07/14] x86/boot/compressed: Handle " Kirill A. Shutemov
2022-06-14 12:02 ` [PATCHv7 08/14] x86/mm: Reserve unaccepted memory bitmap Kirill A. Shutemov
2022-07-26  9:07   ` Borislav Petkov
2022-11-30  1:28     ` Kirill A. Shutemov
2022-12-01  9:37       ` Mike Rapoport
2022-12-01 13:47         ` Kirill A. Shutemov
2022-06-14 12:02 ` [PATCHv7 09/14] x86/mm: Provide helpers for unaccepted memory Kirill A. Shutemov
2022-06-14 12:02 ` [PATCHv7 10/14] x86/mm: Avoid load_unaligned_zeropad() stepping into " Kirill A. Shutemov
2022-06-23 17:19   ` Dave Hansen
2022-07-26 10:21   ` Borislav Petkov
2022-08-02 23:46     ` Dave Hansen
2022-08-03 14:02       ` Dave Hansen
2022-08-11 11:26         ` Borislav Petkov
2022-08-13 16:11           ` Andy Lutomirski
2022-08-13 21:13             ` Kirill A. Shutemov
2022-08-13 16:04         ` Andy Lutomirski
2022-08-13 20:58           ` Kirill A. Shutemov
2022-07-26 17:25   ` Borislav Petkov
2022-07-26 17:46     ` Dave Hansen
2022-07-26 20:17   ` Andy Lutomirski
2022-08-09 11:38     ` Kirill A. Shutemov
2022-08-13 16:03       ` Andy Lutomirski
2022-08-13 21:02         ` Kirill A. Shutemov
2022-06-14 12:02 ` [PATCHv7 11/14] x86: Disable kexec if system has " Kirill A. Shutemov
2022-06-23 17:23   ` Dave Hansen
2022-06-23 21:48     ` Eric W. Biederman
2022-06-24  2:00       ` Kirill A. Shutemov
2022-06-28 23:51         ` Kirill A. Shutemov
2022-06-29  0:10           ` Dave Hansen
2022-06-29  0:59             ` Kirill A. Shutemov
2022-07-04  7:18               ` Dave Young
2022-06-14 12:02 ` [PATCHv7 12/14] x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub Kirill A. Shutemov
2022-06-23 17:25   ` Dave Hansen
2022-06-14 12:02 ` [PATCHv7 13/14] x86/tdx: Refactor try_accept_one() Kirill A. Shutemov
2022-06-23 17:31   ` Dave Hansen
2022-07-26 10:58   ` Borislav Petkov
2022-06-14 12:02 ` [PATCHv7 14/14] x86/tdx: Add unaccepted memory support Kirill A. Shutemov
2022-06-24 16:22   ` Dave Hansen
2022-06-27 10:42     ` Kirill A. Shutemov
2022-07-26 14:51   ` Borislav Petkov
2022-08-09 11:45     ` Kirill A. Shutemov
2022-08-10 10:27       ` Borislav Petkov
2022-06-24 16:37 ` [PATCHv7 00/14] mm, x86/cc: Implement support for unaccepted memory Peter Gonda
2022-06-24 16:57   ` Dave Hansen
2022-06-24 17:06     ` Marc Orr
2022-06-24 17:09       ` Dave Hansen
2022-06-24 17:15         ` Peter Gonda
2022-06-24 17:19         ` Marc Orr
2022-06-24 17:21           ` Peter Gonda
2022-06-24 17:47           ` Dave Hansen
2022-06-24 18:10             ` Peter Gonda
2022-06-24 18:13               ` Dave Hansen
2022-06-24 17:40   ` Michael Roth
2022-06-24 17:58     ` Michael Roth
2022-06-24 18:05     ` Peter Gonda
2022-06-27 11:30   ` Kirill A. Shutemov
2022-06-27 11:54     ` Ard Biesheuvel
2022-06-27 12:22       ` Kirill A. Shutemov
2022-06-27 16:17         ` Peter Gonda
2022-06-27 16:33           ` Ard Biesheuvel
2022-06-27 22:38             ` Kirill A. Shutemov
2022-06-28 17:17               ` Ard Biesheuvel
2022-07-18 17:21                 ` Kirill A. Shutemov
2022-07-18 23:32                   ` Dionna Amalie Glaze
2022-07-19  0:31                     ` Dionna Amalie Glaze
2022-07-19 18:29                       ` Dionna Amalie Glaze
2022-07-19 19:13                         ` Borislav Petkov
2022-07-19 20:45                           ` Ard Biesheuvel
2022-07-19 21:23                             ` Borislav Petkov
2022-07-19 21:35                               ` Dave Hansen
2022-07-19 21:50                                 ` Borislav Petkov
2022-07-19 22:01                                   ` Kirill A. Shutemov
2022-07-19 22:02                                   ` Dave Hansen
2022-07-19 22:08                                     ` Tom Lendacky
2022-07-20  0:26                                     ` Marc Orr
2022-07-20  5:44                                       ` Borislav Petkov
2022-07-20 17:03                                         ` Marc Orr
2022-07-22 15:07                                           ` Borislav Petkov
2022-07-21 17:12                                       ` Dave Hansen
2022-07-23 11:14                                         ` Ard Biesheuvel
2022-07-28 22:01                                           ` Dionna Amalie Glaze
2022-08-09 11:14                                           ` Kirill A. Shutemov
2022-08-09 11:36                                             ` Ard Biesheuvel
2022-08-09 11:54                                               ` Kirill A. Shutemov
2022-08-09 21:09                                                 ` Dionna Amalie Glaze
2022-07-19  2:48                     ` Yao, Jiewen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e828b48f-dcd8-6404-fc30-6e1dd682252f@redhat.com \
    --to=david@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=ardb@kernel.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=dfaggioli@suse.com \
    --cc=jroedel@suse.de \
    --cc=khalid.elmously@canonical.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=marcelo.cerri@canonical.com \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=philip.cox@canonical.com \
    --cc=rientjes@google.com \
    --cc=rppt@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tim.gardner@canonical.com \
    --cc=vbabka@suse.cz \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).