linux-kernel.vger.kernel.org archive mirror
* [LSF/MM TOPIC] Use NVDIMM as NUMA node and NUMA API
@ 2019-01-30 17:26 Yang Shi
From: Yang Shi @ 2019-01-30 17:26 UTC
  To: lsf-pc, linux-mm, linux-kernel
  Cc: mhocko, hannes, dan.j.williams, dave.hansen, fengguang.wu, YangShi

Hi folks,


I would like to attend the LSF/MM Summit 2019. I'm interested in most MM
topics, particularly the NUMA API topic proposed by Jerome, since it is
related to my proposal below.

I would like to share some of our use cases, needs, and approaches for
using NVDIMM as a NUMA node.

We would like to offer NVDIMM to our cloud customers as low-cost
memory.  Virtual machines could run with NVDIMM as their backing memory.
For that, we would like the following needs to be met:

     * The ratio of DRAM vs NVDIMM is configurable per process, or even
per VMA
     * The user VMs always get DRAM first as long as the ratio is not
reached
     * Cold data is migrated to NVDIMM and hot data is kept in DRAM,
dynamically and throughout the lifetime of the VMs
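
To make the first two needs concrete, here is a minimal userspace sketch
of a ratio-bounded placement decision. It is not the in-house
implementation; the names (tier_usage, pick_tier, account_page) and the
exact reading of "DRAM first as long as the ratio is not reached" (DRAM
is chosen while its current share is below the configured percentage)
are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum mem_tier { TIER_DRAM, TIER_NVDIMM };

/* Pages currently charged to one process (or one VMA) per tier. */
struct tier_usage {
    unsigned long dram_pages;
    unsigned long nvdimm_pages;
};

/*
 * Decide where the next page should come from.
 * dram_percent is the configured DRAM share (0..100).
 * DRAM is preferred while its share, counted against the total after
 * this allocation, is still below the target; otherwise use NVDIMM.
 * Overflow of the multiplications is ignored in this sketch.
 */
static enum mem_tier pick_tier(const struct tier_usage *u,
                               unsigned int dram_percent)
{
    assert(dram_percent <= 100);

    unsigned long total_after = u->dram_pages + u->nvdimm_pages + 1;

    if (u->dram_pages * 100 < (unsigned long)dram_percent * total_after)
        return TIER_DRAM;
    return TIER_NVDIMM;
}

/* Record that one page was allocated from the given tier. */
static void account_page(struct tier_usage *u, enum mem_tier tier)
{
    if (tier == TIER_DRAM)
        u->dram_pages++;
    else
        u->nvdimm_pages++;
}
```

With dram_percent set to 75, eight consecutive allocations under this
rule end up as 6 DRAM pages and 2 NVDIMM pages. The third need (moving
cold pages later) is a separate mechanism and is not modeled here.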

To meet these needs we did an in-house implementation:
     * Provide a madvise interface to configure the ratio
     * Put NVDIMM into a separate zonelist so that default allocations
can't touch it unless it is requested explicitly
     * A kernel thread scans for cold pages
We tried to use just the current NUMA APIs, but we realized they can't
meet our needs.  For example, if we configure a VMA to use 50% DRAM and
50% NVDIMM, mbind() could set a preferred node policy (DRAM node or
NVDIMM node) for this VMA, but it can't control how much DRAM or NVDIMM
is used by this specific VMA to satisfy the ratio.
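
As an illustration of that limitation, the sketch below sets a preferred
node for a range through the raw mbind() syscall. The helper names
(prefer_node, single_node_mask) and the SKETCH_ constant are
hypothetical; the constant is assumed to match MPOL_PREFERRED in the
Linux UAPI headers, and the maxnode convention mirrors what libnuma
passes. Note that the call takes a mode and a node mask only: there is no
argument through which a DRAM/NVDIMM ratio could be expressed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to equal MPOL_PREFERRED from <linux/mempolicy.h>. */
#define SKETCH_MPOL_PREFERRED 1

/* Build a mask with a single node set; limited to one unsigned long. */
static unsigned long single_node_mask(unsigned int node)
{
    assert(node < 8 * sizeof(unsigned long));
    return 1UL << node;
}

/*
 * Ask the kernel to prefer 'node' for pages in [addr, addr + len).
 * addr must be page aligned. Returns 0 on success or -errno on failure.
 * The policy is a preference for one node; it says nothing about how
 * many pages of the range may land on that node.
 */
static int prefer_node(void *addr, size_t len, unsigned int node)
{
    unsigned long mask = single_node_mask(node);
    /* One more than the number of bits in the mask, as libnuma does. */
    unsigned long maxnode = 8 * sizeof(mask) + 1;

    long ret = syscall(SYS_mbind, (unsigned long)addr, (unsigned long)len,
                       (long)SKETCH_MPOL_PREFERRED, &mask, maxnode, 0UL);
    return ret == 0 ? 0 : -errno;
}
```

A caller could apply prefer_node() to the DRAM node or to the NVDIMM
node for a given VMA, but splitting that VMA 50/50 would still require
the application to carve up the range by hand.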

So, IMHO we definitely need more fine-grained APIs to control the NUMA 
behavior.

I'd also like to discuss this topic with:
     Dave Hansen
     Dan Williams
     Fengguang Wu

Other than the above topic, I'd also like to meet other MM developers to
discuss some of our use cases for the memory cgroup (a hallway
conversation may be good enough).  I have submitted some RFC patches to
the mailing list and they did generate some discussion, but we have not
reached a solid conclusion yet.

https://lore.kernel.org/lkml/1547061285-100329-1-git-send-email-yang.shi@linux.alibaba.com/


Thanks,

Yang

