On 30 Jun 2020, at 15:31, Dave Hansen wrote:

>
>
>> BTW is this proposal only for systems having multi-tiers of memory?
>> Can a multi-node DRAM-only system take advantage of this proposal? For
>> example I have a system with two DRAM nodes running two jobs
>> hardwalled to each node. For each job the other node is kind of
>> low-tier memory. If I can describe the per-job demotion paths then
>> these jobs can take advantage of this proposal during occasional
>> peaks.
>
> I don't see any reason it could not work there.  There would just need
> to be a way to set up a different demotion path policy that what was
> done here.

We might need a different threshold (or GFP flag) for allocating new pages
on a remote node for demotion. Otherwise, we could see scenarios like this:
two nodes in a system are almost full, and Node A tries to demote some pages
to Node B, which in turn triggers page demotion from Node B back to Node A.
We might be able to avoid a demotion cycle by not allowing Node A to demote
pages again, and having it swap pages to disk instead, while Node B is
demoting its pages to Node A. But this still leads to a longer reclaim path
than having Node A swap to disk directly. In such cases, Node A should just
swap pages to disk without bothering Node B at all.

Maybe something like a GFP_DEMOTION flag for allocating pages for demotion,
where the flag requires more free pages to be available in the destination
node, would avoid the situation above?
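
To make the idea concrete, here is a rough userspace sketch of the check I
have in mind. It is not kernel code: struct toy_node, demotion_margin,
demotion_target_ok() and pick_reclaim_action() are all made-up names for
illustration, and the real watermark logic would look different. The point
is only that the destination node must keep extra headroom above its normal
high watermark, otherwise the source node swaps to disk directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a NUMA node; hypothetical, not a kernel structure. */
struct toy_node {
	unsigned long free_pages;      /* pages currently free on the node */
	unsigned long high_wmark;      /* ordinary high watermark */
	unsigned long demotion_margin; /* assumed extra headroom for demotion */
};

enum reclaim_action { RECLAIM_DEMOTE, RECLAIM_SWAP };

/*
 * True if dst can absorb nr_pages and still stay at or above
 * high_wmark + demotion_margin, i.e. taking the demoted pages should
 * not push dst into demoting or reclaiming pages of its own.
 */
bool demotion_target_ok(const struct toy_node *dst, unsigned long nr_pages)
{
	unsigned long needed = dst->high_wmark + dst->demotion_margin + nr_pages;

	return dst->free_pages >= needed;
}

/* Demote only when the destination has headroom; otherwise swap directly. */
enum reclaim_action pick_reclaim_action(const struct toy_node *dst,
					unsigned long nr_pages)
{
	assert(nr_pages > 0);
	return demotion_target_ok(dst, nr_pages) ? RECLAIM_DEMOTE : RECLAIM_SWAP;
}
```

With both nodes almost full, the check fails in both directions, so each
node goes straight to swap and neither bothers the other.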


—
Best Regards,
Yan Zi