On Thu, Dec 19, 2019 at 03:07:17PM -0500, Johannes Weiner wrote: > The effective protection of any given cgroup is a somewhat complicated > construct that depends on the ancestor's configuration, siblings' > configurations, as well as current memory utilization in all these > groups. I agree with that. It makes it a bit hard to determine the equilibrium in advance. > + * Consider the following example tree: > * > + * A A/memory.low = 2G, A/memory.current = 6G > + * //\\ > + * BC DE B/memory.low = 3G B/memory.current = 2G > + * C/memory.low = 1G C/memory.current = 2G > + * D/memory.low = 0 D/memory.current = 2G > + * E/memory.low = 10G E/memory.current = 0 > * > + * and memory pressure is applied, the following memory > + * distribution is expected (approximately*): > * > + * A/memory.current = 2G > + * B/memory.current = 1.3G > + * C/memory.current = 0.6G > + * D/memory.current = 0 > + * E/memory.current = 0 > * > + * *assuming equal allocation rate and reclaimability I think the assumptions for this example don't hold (anymore). Because reclaim rate depends on the usage above protection, the siblings won't be reclaimed equally and so the low_usage proportionality will change over time and the equilibrium distribution is IMO different (I'm attaching an Octave script to calculate it). As it depends on the initial usage, I don't think there can be given such a general example (for overcommit). > @@ -6272,12 +6262,63 @@ struct cgroup_subsys memory_cgrp_subsys = { > * for next usage. This part is intentionally racy, but it's ok, > * as memory.low is a best-effort mechanism. Although it's a different issue but since this updates the docs I'm mentioning it -- we treat memory.min the same, i.e. it's subject to the same race, however, it's not meant to be best effort. I didn't look into outcomes of potential misaccounting but the comment seems to miss impact on memory.min protection. > @@ -6292,52 +6333,29 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root, > [...] > + if (parent == root) { > + memcg->memory.emin = memcg->memory.min; > + memcg->memory.elow = memcg->memory.low; > + goto out; > } Shouldn't this condition be 'if (parent == root_mem_cgroup)'? (I.e. 1st level takes direct input, but 2nd and further levels redistribute only what they really got from parent.) Michal