On Wednesday, June 23, 2021, Hui Zhu wrote:
> From: Hui Zhu
>
> We did some virtio-mem resize tests in a high memory pressure environment.
> Memory increases slowly and sometimes fails in these tests.
> This is a way to reproduce the issue:
> Start QEMU with a small amount of memory (132 MB) and resize the
> virtio-mem device to hotplug memory. Then we get the following error:
>
> [    8.097461] virtio_mem virtio0: requested size: 0x10000000
> [    8.098038] virtio_mem virtio0: plugging memory: 0x100000000 - 0x107ffffff
> [    8.098829] virtio_mem virtio0: adding memory: 0x100000000 - 0x107ffffff
> [    8.106298] kworker/0:1: vmemmap alloc failure: order:9, mode:0x4cc0(GFP_KERNEL|__GFP_RETRY_MAYFAIL), nodemask=(null),cpuset=/,mems_allowed=0
> [    8.107609] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 5.13.0-rc7+
> [    8.108295] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [    8.109476] Workqueue: events_freezable virtio_mem_run_wq
> [    8.110039] Call Trace:
> [    8.110305]  dump_stack+0x76/0x94
> [    8.110654]  warn_alloc.cold+0x7b/0xdf
> [    8.111054]  ? __alloc_pages+0x2c2/0x310
> [    8.111462]  vmemmap_alloc_block+0x86/0xdc
> [    8.111891]  vmemmap_populate+0xfc/0x325
> [    8.112309]  __populate_section_memmap+0x38/0x4e
> [    8.112787]  sparse_add_section+0x167/0x244
> [    8.113226]  __add_pages+0xa6/0x130
> [    8.113592]  add_pages+0x12/0x60
> [    8.113934]  add_memory_resource+0x114/0x2d0
> [    8.114377]  add_memory_driver_managed+0x7c/0xc0
> [    8.114852]  virtio_mem_add_memory+0x57/0xe0
> [    8.115304]  virtio_mem_sbm_plug_and_add_mb+0x9a/0x130
> [    8.115833]  virtio_mem_run_wq+0x9d5/0x1100
>
> I think allocating 2 MiB of contiguous memory can be slow and can fail
> in some cases, especially in a high memory pressure environment.
> This commit tries to add support for memory_hotplug.memmap_on_memory to
> handle this issue.
>
> Only SBM mode supports it, because memory_hotplug.memmap_on_memory
> requires a single memory block.

Hi,

I'm on vacation this and next week. I'll have a closer look when I'm back.

We also want to have this optimization for BBM, initially when a big block
comprises a single memory block. But we can add that separately later.

> Add nr_vmemmap_pages and sbs_vmemmap to struct sbm.
> If memory_hotplug.memmap_on_memory is enabled, the number of pages of a
> memory block's internal metadata is stored in nr_vmemmap_pages.
> sbs_vmemmap is the number of vmemmap subblocks per Linux memory block.
> The number of pages in the vmemmap subblocks can be bigger than
> nr_vmemmap_pages, because sb_size needs to span at least
> MAX_ORDER_NR_PAGES and pageblock_nr_pages pages (virtio_mem_init).
> None of the pages in the vmemmap subblocks are added to the buddy, not
> even the pages that are not used to store the internal metadata
> (struct pages), because they would not work reliably with
> alloc_contig_range().

We most certainly want to handle subblocks partially consumed by metadata
and expose that memory to the buddy. alloc_contig_range() will really only
be sub-optimal on ZONE_NORMAL right now when called on pageblock
granularity; so that's when we can expect memory unplug to be less
reliable, which is the case either way. ZONE_MOVABLE should be just fine,
I think.

> When resizing virtio-mem, sbs_vmemmap is accounted for in
> virtio_mem_sbm_plug_and_add_mb, virtio_mem_sbm_unplug_any_sb_offline
> and virtio_mem_sbm_unplug_any_sb_online, because the internal metadata
> also needs real pages in the host to back it.
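For reference, initializing those two fields could look roughly like the
following. This is a completely untested sketch; the helper name and the
exact rounding are my assumptions, not taken from the patch:

static void virtio_mem_sbm_init_vmemmap(struct virtio_mem *vm)
{
	/* Bytes of "struct page" metadata for one Linux memory block. */
	const uint64_t memmap_size = (memory_block_size_bytes() / PAGE_SIZE) *
				     sizeof(struct page);

	/* #pages needed to hold that metadata ... */
	vm->sbm.nr_vmemmap_pages = PFN_UP(memmap_size);
	/*
	 * ... rounded up to whole subblocks. Because sb_size spans at least
	 * MAX_ORDER_NR_PAGES and pageblock_nr_pages pages, the vmemmap
	 * subblocks can cover more pages than nr_vmemmap_pages.
	 */
	vm->sbm.sbs_vmemmap = DIV_ROUND_UP(memmap_size, vm->sbm.sb_size);
}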
> I think making the virtio-mem size match the actual memory footprint on
> the host is better if we want to set up a memory cgroup for QEMU.
>
> I did not add a special module_param for this feature and did not move
> the code inside CONFIG_MHP_MEMMAP_ON_MEMORY.
> Do I need to add them?

There is a single tunable to enable memmap_on_memory, so that should be
sufficient I think.

Thanks!
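P.S.: For completeness, consuming that tunable from the driver side could
look roughly like the following. This is an untested sketch against the
5.13 hotplug API, not the actual patch; the fallback is my assumption,
based on the core returning -EINVAL when MHP_MEMMAP_ON_MEMORY is requested
but memory_hotplug.memmap_on_memory is disabled or the block layout does
not allow it:

static int virtio_mem_add_memory(struct virtio_mem *vm, uint64_t addr,
				 uint64_t size)
{
	/*
	 * Ask the core to allocate the memmap from the hotplugged range
	 * itself (via a vmem_altmap) instead of doing an order-9
	 * allocation from the already pressured buddy. With SBM, size is
	 * a single memory block, as the flag requires.
	 */
	int rc = add_memory_driver_managed(vm->nid, addr, size,
					   vm->resource_name,
					   MHP_MERGE_RESOURCE |
					   MHP_MEMMAP_ON_MEMORY);

	/* Fall back to a plain hotplug without the flag. */
	if (rc == -EINVAL)
		rc = add_memory_driver_managed(vm->nid, addr, size,
					       vm->resource_name,
					       MHP_MERGE_RESOURCE);
	return rc;
}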