On 30 Jun 2020, at 15:31, Dave Hansen wrote:

>> BTW is this proposal only for systems having multi-tiers of memory?
>> Can a multi-node DRAM-only system take advantage of this proposal? For
>> example I have a system with two DRAM nodes running two jobs
>> hardwalled to each node. For each job the other node is kind of
>> low-tier memory. If I can describe the per-job demotion paths then
>> these jobs can take advantage of this proposal during occasional
>> peaks.
>
> I don't see any reason it could not work there. There would just need
> to be a way to set up a different demotion path policy that what was
> done here.

We might need a different threshold (or GFP flag) for allocating new pages
on a remote node for demotion. Otherwise, we could see scenarios like this:
two nodes in a system are almost full, and Node A tries to demote some
pages to Node B, which in turn triggers page demotion from Node B back to
Node A.

We might be able to avoid such a demotion cycle by not allowing Node A to
demote pages again, and having it swap pages to disk instead, while Node B
is demoting its pages to Node A. But this still leads to a longer reclaim
path than having Node A swap to disk directly. In such cases, Node A should
just swap pages to disk without bothering Node B at all.

Maybe something like a GFP_DEMOTION flag for allocating pages for demotion,
where the flag requires more free pages to be available on the destination
node, would avoid the situation above?

--
Best Regards,
Yan Zi
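
P.S. To make the threshold idea a bit more concrete, here is a rough
userspace sketch, not kernel code. Everything in it is an assumption made
up for illustration: struct node_model, DEMOTION_WMARK_FACTOR,
demotion_target_ok() and choose_reclaim_action() do not exist in the
kernel, and the factor of 2 is arbitrary. It only shows the decision being
suggested: demote to a node only if it keeps extra headroom above its
normal high watermark, otherwise swap directly.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a NUMA node; not a kernel structure. */
struct node_model {
	unsigned long free_pages;	/* pages currently free on the node */
	unsigned long high_wmark;	/* normal high watermark, in pages */
};

enum reclaim_action { RECLAIM_DEMOTE, RECLAIM_SWAP };

/*
 * Assumed extra headroom that a GFP_DEMOTION-style allocation would
 * require, as a multiple of the high watermark. The value is arbitrary.
 */
#define DEMOTION_WMARK_FACTOR 2UL

/*
 * Return true if the destination node would still have more free pages
 * than the stricter demotion threshold after taking nr_pages from it.
 */
bool demotion_target_ok(const struct node_model *dst, unsigned long nr_pages)
{
	unsigned long required = dst->high_wmark * DEMOTION_WMARK_FACTOR;

	/* Not enough free pages to hold the demoted pages at all. */
	if (dst->free_pages < nr_pages)
		return false;

	return dst->free_pages - nr_pages >= required;
}

/*
 * Decide what the source node should do with nr_pages cold pages:
 * demote them if the destination has real headroom, otherwise swap
 * them to disk directly without disturbing the destination node.
 */
enum reclaim_action choose_reclaim_action(const struct node_model *dst,
					  unsigned long nr_pages)
{
	return demotion_target_ok(dst, nr_pages) ? RECLAIM_DEMOTE
						 : RECLAIM_SWAP;
}
```

With this kind of check, an almost-full Node B is never picked as a
demotion target, so Node A goes straight to swap and the A -> B -> A cycle
does not start in the first place.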