Linux-mm Archive on lore.kernel.org
* [LSF/MM TOPIC] Guest memory without struct page
@ 2020-02-14 21:32 Joao Martins
From: Joao Martins @ 2020-02-14 21:32 UTC
  To: lsf-pc; +Cc: linux-mm

All system RAM is tracked by a metadata structure called 'struct page', which
is 64 bytes in size and describes one page of memory. On x86 (or any system
where PAGE_SIZE is 4K) this data structure amounts to roughly 1.5% (64/4096)
of total capacity.

For hypervisors -- especially those without vhost/PV-devices, using just VFs --
persistent/volatile memory is largely assigned to userspace without the kernel
taking part in any of its I/O paths, except for VFIO. 1.5% may not seem like
much, but it is still a total of 16G per TB just for struct page, which is a lot
considering the hypervisor won't need it; that memory should instead be used to
create more guests (=Happy Users).
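
The overhead figures above follow from simple arithmetic. A minimal sketch,
assuming a 64-byte struct page, 4K base pages and binary units (1T = 2^40
bytes); the function names are made up for illustration:

```python
STRUCT_PAGE_SIZE = 64  # bytes per struct page, as quoted above
PAGE_SIZE_4K = 4096    # base page size on x86


def struct_page_overhead_ratio(page_size=PAGE_SIZE_4K,
                               struct_page_size=STRUCT_PAGE_SIZE):
    """Fraction of memory consumed by struct page metadata."""
    return struct_page_size / page_size


def struct_page_overhead_bytes(total_bytes, page_size=PAGE_SIZE_4K,
                               struct_page_size=STRUCT_PAGE_SIZE):
    """Bytes of struct page metadata needed to track total_bytes of RAM."""
    nr_pages = total_bytes // page_size
    return nr_pages * struct_page_size


if __name__ == "__main__":
    tib = 1 << 40
    gib = 1 << 30
    print(f"ratio:   {struct_page_overhead_ratio():.4%}")
    print(f"per TiB: {struct_page_overhead_bytes(tib) / gib:.0f} GiB")
```

The exact ratio is 64/4096 = 1.5625%, which the text rounds to 1.5%.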

The RFC patches submitted here [0] approach this through device-dax, given the
interface it already provides for VMMs and given that struct page is also a
source of overhead for non-volatile memory assigned to guests. Essentially it
extends device-dax to create a PFNMAP vma with special pages (while adding
support for huge special pages). Host memory would be limited through some form
of mem=X, efi_fake_mem=Y@X:0x40000 or memmap=Y@X-1+0xefffffff, i.e. dedicating
Y amount to guest memory.
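
For illustration, the userspace side of this could look roughly like the
sketch below: a VMM maps a device-dax node shared and read/write and uses the
mapping as guest RAM. This is not taken from the RFC; the helper name, the
2M default alignment and any node path such as /dev/dax0.0 are assumptions
(the real alignment should be read from the device's 'align' sysfs attribute).

```python
import mmap
import os

# Assumed device-dax mapping alignment (hypothetical default).
ALIGN_2M = 2 * 1024 * 1024


def map_guest_memory(path, length, align=ALIGN_2M):
    """Map `length` bytes of `path` MAP_SHARED and read/write, as a VMM
    might do for guest RAM backed by a device-dax node."""
    if length <= 0 or length % align:
        raise ValueError("length must be a positive multiple of the alignment")
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        # Python's mmap object keeps its own duplicate of the descriptor,
        # so the mapping stays valid after this close.
        os.close(fd)
```

With the proposed change the kernel would back such a mapping with a PFNMAP
vma rather than struct page backed memory; nothing changes in how the VMM
calls mmap.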

Should vhost-{net,scsi,etc} be used, we copy from/to guest memory (which works
today for vhost-net, and could easily be adjusted for vhost-scsi), or perhaps
explore dynamically creating/freeing struct pages on temporary GUP pinning.

The aim of this topic is to brainstorm the idea/proposal and to discuss
alternatives/pitfalls/limitations/other-usecases(*).

Regards,
  Joao

(*) To some extent there might be a similarity to the '"Secret" memory userspace
APIs' subitem of this previously submitted topic [1], given that the guest memory
in the described topic isn't part of the direct map.

[0] https://lore.kernel.org/linux-mm/20200110190313.17144-1-joao.m.martins@oracle.com/
[1] https://lore.kernel.org/linux-mm/20200206165900.GD17499@linux.ibm.com/


