From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Marcel Apfelbaum" <mapfelba@redhat.com>,
	"Murilo Opsfelder Araujo" <muriloo@linux.ibm.com>,
	"Igor Kotrasinski" <i.kotrasinsk@partner.samsung.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Peter Xu" <peterx@redhat.com>, "Greg Kurz" <groug@kaod.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Igor Mammedov" <imammedo@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>
Subject: [PATCH v4 00/14] RAM_NORESERVE, MAP_NORESERVE and hostmem "reserve" property
Date: Fri, 19 Mar 2021 11:12:16 +0100
Message-ID: <20210319101230.21531-1-david@redhat.com>

Some fixes for shared anonymous memory, cleanups previously sent in other
context (resizeable allocations), followed by RAM_NORESERVE, implementing
it under POSIX using MAP_NORESERVE, and letting users configure it for
memory backends using the "reserve" property (default: true).

In the context of QEMU, MAP_NORESERVE under Linux has an effect on:
1) Private/shared anonymous memory
-> memory-backend-ram,id=mem0,size=10G
2) Private fd-based mappings
-> memory-backend-file,id=mem0,size=10G,mem-path=/dev/shm/0
-> memory-backend-memfd,id=mem0,size=10G
3) Private/shared hugetlb mappings
-> memory-backend-memfd,id=mem0,size=10G,hugetlb=on,hugetlbsize=2M

With MAP_NORESERVE/"reserve=off", we won't be reserving swap space (1/2) or
huge pages (3) for the whole memory region.
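
As a rough illustration of case 1), the following sketch maps private
anonymous memory with or without MAP_NORESERVE. The helper name
demo_map_anon() is made up for this example and is not part of QEMU or of
this series; on the QEMU command line the equivalent knob would be
something like "-object memory-backend-ram,id=mem0,size=10G,reserve=off".

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS and MAP_NORESERVE in glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not a QEMU function: map private anonymous memory
 * and, if "reserve" is false, ask the kernel not to reserve swap space
 * for the whole region. Returns MAP_FAILED on error, exactly like mmap().
 */
static void *demo_map_anon(size_t size, bool reserve)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

    assert(size > 0);
    if (!reserve) {
        flags |= MAP_NORESERVE;
    }
    return mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
}
```

Pages in such a mapping are still only populated on first access; the flag
only changes whether the kernel accounts the full size up front.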

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM. MAP_NORESERVE tells the OS
"this mapping might be very sparse". This essentially avoids having to
set "/proc/sys/vm/overcommit_memory" to 1 when using virtio-mem, and also
allows supporting hugetlbfs in the future.
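
To make the relation to the overcommit policy concrete, here is a small
sketch that reads the policy and reports whether MAP_NORESERVE can have an
effect. Both helper names are invented for this example. The sketch assumes
the usual Linux behavior that MAP_NORESERVE is ignored for non-hugetlb
mappings when the policy is 2 ("never overcommit").

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Values of /proc/sys/vm/overcommit_memory. */
enum {
    DEMO_OVERCOMMIT_HEURISTIC = 0,
    DEMO_OVERCOMMIT_ALWAYS = 1,
    DEMO_OVERCOMMIT_NEVER = 2,
};

/*
 * Hypothetical helper: return the current overcommit policy (0..2),
 * or -1 if the file cannot be read or parsed.
 */
static int demo_read_overcommit_policy(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int policy = -1;

    if (!f) {
        return -1;
    }
    if (fscanf(f, "%d", &policy) != 1 || policy < 0 || policy > 2) {
        policy = -1;
    }
    fclose(f);
    return policy;
}

/*
 * Hypothetical helper: with "never overcommit", the kernel ignores
 * MAP_NORESERVE for non-hugetlb mappings, so sparse regions are still
 * accounted in full.
 */
static bool demo_noreserve_honored(int policy)
{
    assert(policy >= 0 && policy <= 2);
    return policy != DEMO_OVERCOMMIT_NEVER;
}
```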

virtio-mem currently only supports anonymous memory; in the future we want
to also support private memfd, shared file-based and shared hugetlbfs
mappings.

virtio-mem features I am currently working on that will make it all
play together with this work include:
1. Introducing a prealloc option for virtio-mem (e.g., using fallocate()
   when plugging blocks) to fail nicely when running out of
   backing storage like huge pages ("prealloc=on").
2. Handling virtio-mem requests via an iothread to not hold the BQL while
   populating/preallocating memory ("iothread=X").
3. Protecting unplugged memory e.g., using userfaultfd ("prot=uffd").
4. Dynamic reservation of swap space ("reserve=on")
5. Supporting resizable RAM blocks/memory regions, such that we won't
   always expose a large, sparse memory region to the VM.
6. (resizeable allocations / optimized mmap handling when resizing RAM
    blocks)
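
Item 1 above could look roughly like the following for fd-based backends.
This is only a sketch of the idea, not the planned virtio-mem code;
demo_prealloc_block() is a made-up name. The point is that running out of
backing storage shows up as an error code when plugging a block instead of
as a SIGBUS on first access.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes fallocate() in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not the virtio-mem implementation: allocate backing
 * storage for [offset, offset + len) of a file-backed region up front.
 * Returns 0 on success or a negative errno value (e.g., -ENOSPC).
 */
static int demo_prealloc_block(int fd, off_t offset, off_t len)
{
    int ret;

    assert(offset >= 0 && len > 0);
    do {
        /* mode 0: allocate the range and extend the file if necessary */
        ret = fallocate(fd, 0, offset, len);
    } while (ret < 0 && errno == EINTR);

    return ret < 0 ? -errno : 0;
}
```

A caller would reject the plug request when this returns a negative value.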

v3 -> v4:
- Minor comment/description updates
- "softmmu/physmem: Fix ram_block_discard_range() to handle shared ..."
-- Extended description
- "util/mmap-alloc: Pass flags instead of separate bools to ..."
-- Move flags to include/qemu/osdep.h and rename to "QEMU_MAP_*"
- "memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()"
-- Adjust to new flags. Handle errors in mmap_activate() for now.
- "util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux"
-- Restrict support to Linux only for now
- "qmp: Include "reserve" property of memory backends"
-- Added
- "hmp: Print "reserve" property of memory backends with ..."
-- Added
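
For readers new to the series, the "flags instead of separate bools"
change can be pictured as below. The DEMO_MAP_* names, their values and
both helpers are invented for this illustration; the series itself uses
"QEMU_MAP_*" flags in include/qemu/osdep.h, whose exact definitions are
not reproduced here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_NORESERVE in glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical flag names and values, for illustration only. */
#define DEMO_MAP_READONLY  (1u << 0)
#define DEMO_MAP_SHARED    (1u << 1)
#define DEMO_MAP_NORESERVE (1u << 2)
#define DEMO_MAP_ALL \
    (DEMO_MAP_READONLY | DEMO_MAP_SHARED | DEMO_MAP_NORESERVE)

/* Translate the combined flags word into mmap() protection bits. */
static int demo_mmap_prot(uint32_t demo_flags)
{
    assert((demo_flags & ~DEMO_MAP_ALL) == 0);
    return PROT_READ | ((demo_flags & DEMO_MAP_READONLY) ? 0 : PROT_WRITE);
}

/* Translate the combined flags word into mmap() mapping flags. */
static int demo_mmap_flags(uint32_t demo_flags)
{
    int flags;

    assert((demo_flags & ~DEMO_MAP_ALL) == 0);
    flags = (demo_flags & DEMO_MAP_SHARED) ? MAP_SHARED : MAP_PRIVATE;
    if (demo_flags & DEMO_MAP_NORESERVE) {
        flags |= MAP_NORESERVE;
    }
    return flags;
}
```

A single flags word lets a new option such as "noreserve" be added without
touching the signature of every caller.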

v2 -> v3:
- Renamed "softmmu/physmem: Drop "shared" parameter from ram_block_add()"
  to "softmmu/physmem: Mark shared anonymous memory RAM_SHARED" and
  adjusted the description
- Added "softmmu/physmem: Fix ram_block_discard_range() to handle shared
  anonymous memory"
- Added "softmmu/physmem: Fix qemu_ram_remap() to handle shared anonymous
  memory"
- Added "util/mmap-alloc: Pass flags instead of separate bools to
  qemu_ram_mmap()"
- "util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE"
-- Further tweak code comments
-- Handle shared anonymous memory

v1 -> v2:
- Rebased onto upstream and the phys_mem_alloc simplifications
-- Upstream added the "map_offset" parameter to many RAM allocation
   interfaces.
- "softmmu/physmem: Drop "shared" parameter from ram_block_add()"
-- Use local variable "shared"
- "memory: introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()"
-- Simplify due to phys_mem_alloc changes
- "util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE"
-- Add a whole bunch of comments.
-- Exclude shared anonymous memory that QEMU doesn't use
-- Special-case readonly mappings

Cc: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Philippe Mathieu-Daudé" <philmd@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Liam Merwick <liam.merwick@oracle.com>
Cc: Marcel Apfelbaum <mapfelba@redhat.com>

David Hildenbrand (14):
  softmmu/physmem: Mark shared anonymous memory RAM_SHARED
  softmmu/physmem: Fix ram_block_discard_range() to handle shared
    anonymous memory
  softmmu/physmem: Fix qemu_ram_remap() to handle shared anonymous
    memory
  util/mmap-alloc: Factor out calculation of the pagesize for the guard
    page
  util/mmap-alloc: Factor out reserving of a memory region to
    mmap_reserve()
  util/mmap-alloc: Factor out activating of memory to mmap_activate()
  softmmu/memory: Pass ram_flags to qemu_ram_alloc_from_fd()
  softmmu/memory: Pass ram_flags to
    memory_region_init_ram_shared_nomigrate()
  util/mmap-alloc: Pass flags instead of separate bools to
    qemu_ram_mmap()
  memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()
  util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
  hostmem: Wire up RAM_NORESERVE via "reserve" property
  qmp: Include "reserve" property of memory backends
  hmp: Print "reserve" property of memory backends with "info memdev"

 backends/hostmem-file.c                       |  11 +-
 backends/hostmem-memfd.c                      |   8 +-
 backends/hostmem-ram.c                        |   7 +-
 backends/hostmem.c                            |  33 +++
 hw/core/machine-hmp-cmds.c                    |   2 +
 hw/core/machine-qmp-cmds.c                    |   1 +
 hw/m68k/next-cube.c                           |   4 +-
 hw/misc/ivshmem.c                             |   5 +-
 include/exec/cpu-common.h                     |   1 +
 include/exec/memory.h                         |  42 ++--
 include/exec/ram_addr.h                       |   9 +-
 include/qemu/mmap-alloc.h                     |  16 +-
 include/qemu/osdep.h                          |  30 ++-
 include/sysemu/hostmem.h                      |   2 +-
 migration/ram.c                               |   3 +-
 qapi/machine.json                             |   6 +
 .../memory-region-housekeeping.cocci          |   8 +-
 softmmu/memory.c                              |  27 ++-
 softmmu/physmem.c                             |  61 +++--
 util/mmap-alloc.c                             | 212 +++++++++++++-----
 util/oslib-posix.c                            |   7 +-
 util/oslib-win32.c                            |  13 +-
 22 files changed, 352 insertions(+), 156 deletions(-)

-- 
2.29.2



Thread overview: 25+ messages
2021-03-19 10:12 David Hildenbrand [this message]
2021-03-19 10:12 ` [PATCH v4 01/14] softmmu/physmem: Mark shared anonymous memory RAM_SHARED David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 02/14] softmmu/physmem: Fix ram_block_discard_range() to handle shared anonymous memory David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 03/14] softmmu/physmem: Fix qemu_ram_remap() " David Hildenbrand
2021-03-23 20:40   ` Peter Xu
2021-03-19 10:12 ` [PATCH v4 04/14] util/mmap-alloc: Factor out calculation of the pagesize for the guard page David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 05/14] util/mmap-alloc: Factor out reserving of a memory region to mmap_reserve() David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 06/14] util/mmap-alloc: Factor out activating of memory to mmap_activate() David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 07/14] softmmu/memory: Pass ram_flags to qemu_ram_alloc_from_fd() David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 08/14] softmmu/memory: Pass ram_flags to memory_region_init_ram_shared_nomigrate() David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 09/14] util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap() David Hildenbrand
2021-03-23 20:49   ` Peter Xu
2021-03-25  9:40     ` David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 10/14] memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap() David Hildenbrand
2021-03-23 20:51   ` Peter Xu
2021-03-19 10:12 ` [PATCH v4 11/14] util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux David Hildenbrand
2021-03-23 20:56   ` Peter Xu
2021-03-19 10:12 ` [PATCH v4 12/14] hostmem: Wire up RAM_NORESERVE via "reserve" property David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 13/14] qmp: Include "reserve" property of memory backends David Hildenbrand
2021-03-19 15:40   ` Markus Armbruster
2021-03-19 15:49     ` David Hildenbrand
2021-03-19 16:32       ` Markus Armbruster
2021-03-19 16:40         ` David Hildenbrand
2021-03-19 10:12 ` [PATCH v4 14/14] hmp: Print "reserve" property of memory backends with "info memdev" David Hildenbrand
2021-03-25 19:00   ` Dr. David Alan Gilbert
