From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	ebiederm@xmission.com, akpm@linux-foundation.org,
	stanislav.kinsburskii@gmail.com, corbet@lwn.net,
	linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org, kys@microsoft.com, jgowans@amazon.com,
	wei.liu@kernel.org, arnd@arndb.de, gregkh@linuxfoundation.org,
	graf@amazon.de, pbonzini@redhat.com, bhe@redhat.com,
	dave.hansen@intel.com, kirill.shutemov@intel.com
Subject: [RFC PATCH v3 0/3] Introduce persistent memory pool
Date: Wed, 04 Oct 2023 15:23:09 -0700
Message-ID: <169645773092.11424.7258549771090599226.stgit@skinsburskii.>

This series introduces a memory allocator tailored for memory that must
persist across kexec. The allocator preserves kernel-specific state, such
as DMA passthrough device state and IOMMU state, from the old kernel to
the new one.

The current implementation provides a foundation for custom solutions that
may be developed in the future. Although the design is kept concise and
straightforward to encourage discussion and feedback, it remains fully
functional.

The immediate need for the allocator is the ability to persist kernel
pages deposited into the Microsoft Hypervisor across kexec: the kernel must
not access these pages while they are deposited, but they can be withdrawn
and released back to the kernel. Kexec, in turn, is used for servicing and
aims to minimize service downtime during kernel upgrades across a fleet of
machines.

The persistent memory pool builds upon the contiguous memory allocator
(CMA) and ensures CMA state persistency across kexec by incorporating the
CMA bitmap into the memory region instead of allocating it from kernel
memory.
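
As a rough illustration of the idea above (allocation state stored inside
the region it describes, so it can be found again after kexec), here is a
minimal user-space sketch. It is not the code from mm/pmpool.c and does not
use the CMA API. All names (pool_hdr, pool_attach, POOL_MAGIC and so on) are
hypothetical, and "kexec" is simulated by simply attaching to the same
buffer a second time.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POOL_PAGE_SIZE 4096u
#define POOL_MAGIC     0x504d504fu  /* hypothetical marker for a valid pool */

/* Metadata kept at the start of the region itself, not in "kernel" memory. */
struct pool_hdr {
	uint32_t magic;
	uint32_t nr_pages;    /* allocatable pages following the metadata */
	uint32_t first_page;  /* index of the first allocatable page */
	uint32_t reserved;
	uint8_t  bitmap[];    /* one bit per allocatable page */
};

/* Attach to a region: reuse existing metadata if valid, else initialize. */
static struct pool_hdr *pool_attach(void *region, size_t size)
{
	struct pool_hdr *h = region;
	size_t total = size / POOL_PAGE_SIZE;
	size_t meta, meta_pages;

	if (total < 2)
		return NULL;
	if (h->magic == POOL_MAGIC && h->first_page + (size_t)h->nr_pages == total)
		return h;  /* state survived: the bitmap is already in place */

	meta = sizeof(*h) + (total + 7) / 8;
	meta_pages = (meta + POOL_PAGE_SIZE - 1) / POOL_PAGE_SIZE;
	if (meta_pages >= total)
		return NULL;
	memset(region, 0, meta_pages * POOL_PAGE_SIZE);
	h->magic = POOL_MAGIC;
	h->first_page = (uint32_t)meta_pages;
	h->nr_pages = (uint32_t)(total - meta_pages);
	return h;
}

/* Translate a page address into its bit index in the bitmap. */
static size_t pool_page_index(const struct pool_hdr *h, const void *page)
{
	size_t off = (size_t)((const unsigned char *)page - (const unsigned char *)h);

	return off / POOL_PAGE_SIZE - h->first_page;
}

/* First-fit allocation of a single page; NULL when the pool is full. */
static void *pool_alloc_page(struct pool_hdr *h)
{
	for (uint32_t i = 0; i < h->nr_pages; i++) {
		if (!(h->bitmap[i / 8] & (1u << (i % 8)))) {
			h->bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
			return (unsigned char *)h +
			       (size_t)(h->first_page + i) * POOL_PAGE_SIZE;
		}
	}
	return NULL;
}

static void pool_free_page(struct pool_hdr *h, void *page)
{
	size_t idx = pool_page_index(h, page);

	assert(idx < h->nr_pages);
	h->bitmap[idx / 8] &= (uint8_t)~(1u << (idx % 8));
}

static int pool_page_is_allocated(const struct pool_hdr *h, const void *page)
{
	size_t idx = pool_page_index(h, page);

	return (h->bitmap[idx / 8] >> (idx % 8)) & 1;
}

/* Allocate, free, then "kexec" by re-attaching; returns 0 on success. */
static int pmpool_demo(void)
{
	size_t size = 16 * POOL_PAGE_SIZE;
	void *region = aligned_alloc(POOL_PAGE_SIZE, size);
	struct pool_hdr *h, *h2;
	void *a, *b;
	int ok;

	if (!region)
		return -1;
	memset(region, 0, size);
	h = pool_attach(region, size);
	if (!h) {
		free(region);
		return -1;
	}
	a = pool_alloc_page(h);
	b = pool_alloc_page(h);
	pool_free_page(h, a);

	h2 = pool_attach(region, size);  /* the "new kernel" finds the old state */
	ok = h2 && !pool_page_is_allocated(h2, a) &&
	     pool_page_is_allocated(h2, b) && pool_alloc_page(h2) == a;
	free(region);
	return ok ? 0 : -1;
}
```

The point of the sketch is only that nothing outside the region needs to
be carried over for the allocation state to remain valid; how the new
kernel learns the region's address and size is a separate question.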

Persistent memory pool metadata is passed across kexec using a Flattened
Device Tree, which is added as another kexec segment for the x86
architecture.

Potential applications include:

  1. Enabling various in-kernel entities to allocate persistent pages from
     a unified memory pool, obviating the need for reserving multiple
     regions.

  2. For in-kernel components that need the allocation address to be
     retained across kexec, this address can be exposed to user space
     and subsequently passed through the command line.

  3. Distinct subsystems or drivers can set aside their region, allocating
     a segment for their persistent memory pool, suitable for uses such as
     file systems, key-value stores, and other applications.

Changes since v2:

  1. Device tree-related changes are removed.

  2. Persistent memory pool region is marked as "reserved by kernel" in
     kexec e820 table, which indicates to the new kernel that the pool
     must be restored.

Changes since v1:

  1. Persistent memory pool is now a wrapper on top of CMA instead of
     being a new allocator.

  2. Persistent memory pool metadata doesn't belong to the pool anymore;
     it is now passed to the new kernel over kexec via Flattened Device
     Tree instead.

The following series implements...

---

Stanislav Kinsburskii (3):
      x86/boot/e820: Expose kexec range update, remove and table update functions
      pmpool: Introduce persistent memory pool
      pmpool: Mark reserved range as "kernel reserved" in kexec e820 table


 arch/x86/include/asm/e820/api.h |    4 +
 arch/x86/kernel/e820.c          |   21 ++++-
 include/linux/pmpool.h          |   22 +++++
 mm/Kconfig                      |    8 ++
 mm/Makefile                     |    1 
 mm/pmpool.c                     |  159 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 209 insertions(+), 6 deletions(-)
 create mode 100644 include/linux/pmpool.h
 create mode 100644 mm/pmpool.c




