On Tue, 20 Aug 2019, Shakeel Butt wrote:
> On Tue, Aug 20, 2019 at 3:45 AM Michal Hocko wrote:
> > On Tue 20-08-19 17:48:23, Alex Shi wrote:
> > > This patchset move lru_lock into lruvec, give a lru_lock for each of
> > > lruvec, thus bring a lru_lock for each of memcg.
> > >
> > > Per memcg lru_lock would ease the lru_lock contention a lot in
> > > this patch series.
> > >
> > > In some data center, containers are used widely to deploy different kind
> > > of services, then multiple memcgs share per node pgdat->lru_lock which
> > > cause heavy lock contentions when doing lru operation.
> >
> > Having some real world workloads numbers would be more than useful
> > for a non trivial change like this. I believe googlers have tried
> > something like this in the past but then didn't have really a good
> > example of workloads that benefit. I might misremember though. Cc Hugh.
> >
>
> We, at Google, have been using per-memcg lru locks for more than 7
> years. Per-memcg lru locks are really beneficial for providing
> performance isolation if there are multiple distinct jobs/memcgs
> running on large machines. We are planning to upstream our internal
> implementation. I will let Hugh comment on that.

Thanks for the Cc Michal. As Shakeel says, Google prodkernel has been
using our per-memcg lru locks for 7 years or so. Yes, we did not come up
with supporting performance data at the time of posting, nor since: I see
Alex has done much better on that (though I haven't even glanced to see if
+s are persuasive).

https://lkml.org/lkml/2012/2/20/434
was how ours was back then; some parts of that went in, then attached
lrulock417.tar is how it was the last time I rebased, to v4.17.

I'll set aside what I'm doing, and switch to rebasing ours to v5.3-rc
and/or mmotm.
Then compare with what Alex has, to see if there's any good reason to
prefer one to the other: if no good reason to prefer ours, I doubt we
shall bother to repost, but just use it as basis for helping to review
or improve Alex's.

Hugh