linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
@ 2022-11-10  6:53 Lu Jialin
  2022-11-10 14:42 ` Michal Koutný
  0 siblings, 1 reply; 7+ messages in thread
From: Lu Jialin @ 2022-11-10  6:53 UTC (permalink / raw)
  To: Johannes Weiner, Andrew Morton, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song
  Cc: Lu Jialin, cgroups, linux-kernel, linux-mm

When a user reclaims memory through memory.reclaim, after draining the
percpu lru caches, also drain the percpu charge caches (stock) of the
given memcg, in the hope of introducing more evictable pages.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
---
 mm/memcontrol.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2d8549ae1b30..768091cc6a9a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6593,10 +6593,13 @@ static ssize_t memory_reclaim(struct kernfs_open_file *of, char *buf,
 		/*
 		 * This is the final attempt, drain percpu lru caches in the
 		 * hope of introducing more evictable pages for
-		 * try_to_free_mem_cgroup_pages().
+		 * try_to_free_mem_cgroup_pages(). Also, drain all percpu
+		 * charge caches for given memcg.
 		 */
-		if (!nr_retries)
+		if (!nr_retries) {
 			lru_add_drain_all();
+			drain_all_stock(memcg);
+		}
 
 		reclaimed = try_to_free_mem_cgroup_pages(memcg,
 						nr_to_reclaim - nr_reclaimed,
-- 
2.17.1



* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-10  6:53 [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim Lu Jialin
@ 2022-11-10 14:42 ` Michal Koutný
  2022-11-10 19:35   ` Yosry Ahmed
  0 siblings, 1 reply; 7+ messages in thread
From: Michal Koutný @ 2022-11-10 14:42 UTC (permalink / raw)
  To: Lu Jialin
  Cc: Johannes Weiner, Andrew Morton, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, cgroups, linux-kernel, linux-mm

Hello Jialin.

On Thu, Nov 10, 2022 at 02:53:16PM +0800, Lu Jialin <lujialin4@huawei.com> wrote:
> When a user reclaims memory through memory.reclaim, after draining the
> percpu lru caches, also drain the percpu charge caches (stock) of the
> given memcg, in the hope of introducing more evictable pages.

Do you have any data on materialization of this hope?

IIUC, the stock is useful for batched accounting to page_counter but it
doesn't represent real pages. I.e. your change may reduce the
page_counter value but it would not release any pages. Or have I missed
a way how it helps with the reclaim?
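
To make the point concrete, here is a minimal userspace sketch of the
batched accounting, not the kernel code. The struct, the function names
and the 4-CPU array are hypothetical; the batch size of 64 pages is an
assumption inferred from the "63 * PAGE_SIZE" figure quoted later in
this thread.

```c
#define CHARGE_BATCH  64U /* assumed batch size in pages */
#define NR_CPUS_MODEL 4   /* hypothetical CPU count for this model */

/* Hypothetical model of one memcg; not a kernel structure. */
struct memcg_model {
	unsigned long page_counter;        /* charged pages, i.e. what memory.current reflects */
	unsigned long resident_pages;      /* pages that really exist */
	unsigned int stock[NR_CPUS_MODEL]; /* pre-charged, still unused pages per CPU */
};

/* Charge one page on @cpu: use the local stock, or charge a whole batch first. */
static void model_charge_page(struct memcg_model *m, int cpu)
{
	if (m->stock[cpu] == 0) {
		m->page_counter += CHARGE_BATCH; /* overcharge up front */
		m->stock[cpu] = CHARGE_BATCH;
	}
	m->stock[cpu]--;
	m->resident_pages++;
}

/* Hand every CPU's unused stock back to the counter; returns pages uncharged. */
static unsigned long model_drain_all_stock(struct memcg_model *m)
{
	unsigned long drained = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		drained += m->stock[cpu];
		m->stock[cpu] = 0;
	}
	m->page_counter -= drained; /* the counter drops ... */
	return drained;             /* ... but resident_pages is untouched */
}
```

In this model, draining only moves page_counter back toward
resident_pages; no page is freed and nothing becomes evictable.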

Thanks,
Michal


* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-10 14:42 ` Michal Koutný
@ 2022-11-10 19:35   ` Yosry Ahmed
  2022-11-10 19:45     ` Yosry Ahmed
  2022-11-11 10:08     ` Michal Koutný
  0 siblings, 2 replies; 7+ messages in thread
From: Yosry Ahmed @ 2022-11-10 19:35 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Lu Jialin, Johannes Weiner, Andrew Morton, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, cgroups, linux-kernel,
	linux-mm

On Thu, Nov 10, 2022 at 6:42 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> Hello Jialin.
>
> On Thu, Nov 10, 2022 at 02:53:16PM +0800, Lu Jialin <lujialin4@huawei.com> wrote:
> > When a user reclaims memory through memory.reclaim, after draining the
> > percpu lru caches, also drain the percpu charge caches (stock) of the
> > given memcg, in the hope of introducing more evictable pages.
>
> Do you have any data on materialization of this hope?
>
> IIUC, the stock is useful for batched accounting to page_counter but it
> doesn't represent real pages. I.e. your change may reduce the
> page_counter value but it would not release any pages. Or have I missed
> a way how it helps with the reclaim?

+1

It looks like we just overcharge the memcg if the number of allocated
pages is less than the charging batch size, so that upcoming
allocations can go through a fast accounting path and consume from the
precharged stock. I don't understand how draining this charge may help
reclaim.

OTOH, it will reduce the page counters, so if userspace is relying on
memory.current to gauge how much reclaim they want to do, it will make
it "appear" like the usage dropped. If userspace is using other
signals (refaults, PSI, etc), then we would be more-or-less tricking
it into thinking we reclaimed pages when we actually didn't. In that
case we didn't really reclaim anything, we just dropped memory.current
slightly, which wouldn't matter to the user in this case, as other
signals won't change.

The difference in perceived usage coming from draining the stock IIUC
has an upper bound of 63 * PAGE_SIZE (< 256 KB with 4KB pages), I
wonder if this is really significant anyway.

>
> Thanks,
> Michal


* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-10 19:35   ` Yosry Ahmed
@ 2022-11-10 19:45     ` Yosry Ahmed
  2022-11-11 10:08     ` Michal Koutný
  1 sibling, 0 replies; 7+ messages in thread
From: Yosry Ahmed @ 2022-11-10 19:45 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Lu Jialin, Johannes Weiner, Andrew Morton, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, cgroups, linux-kernel,
	linux-mm

On Thu, Nov 10, 2022 at 11:35 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Thu, Nov 10, 2022 at 6:42 AM Michal Koutný <mkoutny@suse.com> wrote:
> >
> > Hello Jialin.
> >
> > On Thu, Nov 10, 2022 at 02:53:16PM +0800, Lu Jialin <lujialin4@huawei.com> wrote:
> > > When a user reclaims memory through memory.reclaim, after draining the
> > > percpu lru caches, also drain the percpu charge caches (stock) of the
> > > given memcg, in the hope of introducing more evictable pages.
> >
> > Do you have any data on materialization of this hope?
> >
> > IIUC, the stock is useful for batched accounting to page_counter but it
> > doesn't represent real pages. I.e. your change may reduce the
> > page_counter value but it would not release any pages. Or have I missed
> > a way how it helps with the reclaim?
>
> +1
>
> It looks like we just overcharge the memcg if the number of allocated
> pages is less than the charging batch size, so that upcoming
> allocations can go through a fast accounting path and consume from the
> precharged stock. I don't understand how draining this charge may help
> reclaim.
>
> OTOH, it will reduce the page counters, so if userspace is relying on
> memory.current to gauge how much reclaim they want to do, it will make
> it "appear" like the usage dropped. If userspace is using other
> signals (refaults, PSI, etc), then we would be more-or-less tricking
> it into thinking we reclaimed pages when we actually didn't. In that
> case we didn't really reclaim anything, we just dropped memory.current
> slightly, which wouldn't matter to the user in this case, as other
> signals won't change.

In fact, we wouldn't be tricking anyone because this will have no
effect on the return value of memory.reclaim. We would just be causing
a side effect of very slightly reducing memory.current. Not sure if
this really helps.

>
> The difference in perceived usage coming from draining the stock IIUC
> has an upper bound of 63 * PAGE_SIZE (< 256 KB with 4KB pages), I
> wonder if this is really significant anyway.
>
> >
> > Thanks,
> > Michal


* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-10 19:35   ` Yosry Ahmed
  2022-11-10 19:45     ` Yosry Ahmed
@ 2022-11-11 10:08     ` Michal Koutný
  2022-11-11 18:24       ` Yosry Ahmed
  1 sibling, 1 reply; 7+ messages in thread
From: Michal Koutný @ 2022-11-11 10:08 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Lu Jialin, Johannes Weiner, Andrew Morton, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, cgroups, linux-kernel,
	linux-mm

On Thu, Nov 10, 2022 at 11:35:34AM -0800, Yosry Ahmed <yosryahmed@google.com> wrote:
> OTOH, it will reduce the page counters, so if userspace is relying on
> memory.current to gauge how much reclaim they want to do, it will make
> it "appear" like the usage dropped.

Assuming memory.current is used to drive the proactive reclaim, then
this patch makes some sense (and is slightly better than draining upon
every memory.current read(2)).

I just think the commit message should explain the real mechanics of
this.

> The difference in perceived usage coming from draining the stock IIUC
> has an upper bound of 63 * PAGE_SIZE (< 256 KB with 4KB pages), I
> wonder if this is really significant anyway.

times nr_cpus (if memcg had stocks all over the place).
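
As a worked version of that bound: a small helper, assuming at most 63
stocked pages per CPU as stated above. The function name is
hypothetical, and the CPU count and page size are parameters, not
values taken from any particular machine.

```c
/* Worst-case bytes sitting in per-CPU stocks for a single memcg. */
static unsigned long long stock_upper_bound_bytes(unsigned int nr_cpus,
						  unsigned long page_size)
{
	/* Figure from this thread: at most 63 stocked pages per CPU. */
	const unsigned int max_stock_pages = 63;

	return (unsigned long long)nr_cpus * max_stock_pages * page_size;
}
```

With 4 KiB pages this gives 258048 bytes (252 KiB) for one CPU, and
16515072 bytes (15.75 MiB) if the memcg had a full stock on each of 64
CPUs.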

Michal


* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-11 10:08     ` Michal Koutný
@ 2022-11-11 18:24       ` Yosry Ahmed
  2022-11-11 20:31         ` Johannes Weiner
  0 siblings, 1 reply; 7+ messages in thread
From: Yosry Ahmed @ 2022-11-11 18:24 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Lu Jialin, Johannes Weiner, Andrew Morton, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, cgroups, linux-kernel,
	linux-mm

On Fri, Nov 11, 2022 at 2:08 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Thu, Nov 10, 2022 at 11:35:34AM -0800, Yosry Ahmed <yosryahmed@google.com> wrote:
> > OTOH, it will reduce the page counters, so if userspace is relying on
> > memory.current to gauge how much reclaim they want to do, it will make
> > it "appear" like the usage dropped.
>
> Assuming memory.current is used to drive the proactive reclaim, then
> this patch makes some sense (and is slightly better than draining upon
> every memory.current read(2)).

I am not sure honestly. This assumes memory.reclaim is used in
response to just memory.current, which is not true in the cases I know
about at least.

If you are using memory.reclaim merely based on memory.current, to
keep the usage below a specified number, then memory.high might be a
better fit? Unless this goal usage is a moving target maybe and you
don't want to keep changing the limits but I don't know if there are
practical use cases for this.
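
For reference, the pattern in question (reclaim driven only by
memory.current) could look roughly like the sketch below. It is an
illustration under assumptions, not a recommended controller: the
helper names are hypothetical, the caller supplies the cgroup
directory, and the only interface facts relied on are that
memory.current reads back a byte count and memory.reclaim accepts a
byte count.

```c
#include <stdio.h>

/* Bytes to ask memory.reclaim for so that usage drops to @target; 0 if already there. */
static unsigned long long reclaim_request_bytes(unsigned long long usage,
						unsigned long long target)
{
	return usage > target ? usage - target : 0;
}

/* Read <cgroup_dir>/memory.current; returns 0 on success, -1 on error. */
static int read_memory_current(const char *cgroup_dir, unsigned long long *usage)
{
	char path[4096];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/memory.current", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%llu", usage) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Write a byte count to <cgroup_dir>/memory.reclaim; returns 0 on success, -1 on error. */
static int write_memory_reclaim(const char *cgroup_dir, unsigned long long bytes)
{
	char path[4096];
	FILE *f;
	int n, err;

	n = snprintf(path, sizeof(path), "%s/memory.reclaim", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	err = fprintf(f, "%llu", bytes) < 0;
	/* stdio buffers the write, so a failed write may only show up at close. */
	if (fclose(f) != 0)
		err = 1;
	return err ? -1 : 0;
}
```

A caller would read the usage, compute reclaim_request_bytes(usage,
target) and write the result if it is non-zero. Such a loop is the only
kind of consumer that would notice the stock being drained, since it
keys off memory.current alone.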

For us at Google, we don't really look at the current usage, but
rather at how much of the current usage we consider "cold" based on
page access bit harvesting. I suspect Meta is doing something similar
using different mechanics (PSI). I am not sure if memory.current is a
factor in either of those use cases, but maybe I am missing something
obvious.

>
> I just think the commit message should explain the real mechanics of
> this.
>
> > The difference in perceived usage coming from draining the stock IIUC
> > has an upper bound of 63 * PAGE_SIZE (< 256 KB with 4KB pages), I
> > wonder if this is really significant anyway.
>
> times nr_cpus (if memcg had stocks all over the place).

Right. In my mind I assumed the memcg would only be stocked on one cpu
for some reason.

>
> Michal


* Re: [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim
  2022-11-11 18:24       ` Yosry Ahmed
@ 2022-11-11 20:31         ` Johannes Weiner
  0 siblings, 0 replies; 7+ messages in thread
From: Johannes Weiner @ 2022-11-11 20:31 UTC (permalink / raw)
  To: Yosry Ahmed
  Cc: Michal Koutný,
	Lu Jialin, Andrew Morton, Michal Hocko, Roman Gushchin,
	Shakeel Butt, Muchun Song, cgroups, linux-kernel, linux-mm

On Fri, Nov 11, 2022 at 10:24:02AM -0800, Yosry Ahmed wrote:
> On Fri, Nov 11, 2022 at 2:08 AM Michal Koutný <mkoutny@suse.com> wrote:
> >
> > On Thu, Nov 10, 2022 at 11:35:34AM -0800, Yosry Ahmed <yosryahmed@google.com> wrote:
> > > OTOH, it will reduce the page counters, so if userspace is relying on
> > > memory.current to gauge how much reclaim they want to do, it will make
> > > it "appear" like the usage dropped.
> >
> > Assuming memory.current is used to drive the proactive reclaim, then
> > this patch makes some sense (and is slightly better than draining upon
> > every memory.current read(2)).
> 
> I am not sure honestly. This assumes memory.reclaim is used in
> response to just memory.current, which is not true in the cases I know
> about at least.
> 
> If you are using memory.reclaim merely based on memory.current, to
> keep the usage below a specified number, then memory.high might be a
> better fit? Unless this goal usage is a moving target maybe and you
> don't want to keep changing the limits but I don't know if there are
> practical use cases for this.
> 
> For us at Google, we don't really look at the current usage, but
> rather at how much of the current usage we consider "cold" based on
> page access bit harvesting. I suspect Meta is doing something similar
> using different mechanics (PSI). I am not sure if memory.current is a
> factor in either of those use cases, but maybe I am missing something
> obvious.

Yeah, Meta drives proactive reclaim through psi feedback.

We do consult memory.current to enforce minimums, just for safety
reasons. But that's a very conservative parameter; the percpu fuzz
doesn't make much of a difference there; certainly, we haven't had any
problems with memory.reclaim not draining stocks.

So I would agree that it's not entirely obvious why stocks should be
drained as part of memory.reclaim. I'm curious what led to the patch.



Thread overview: 7+ messages
2022-11-10  6:53 [PATCH] mm/memcontrol.c: drains percpu charge caches in memory.reclaim Lu Jialin
2022-11-10 14:42 ` Michal Koutný
2022-11-10 19:35   ` Yosry Ahmed
2022-11-10 19:45     ` Yosry Ahmed
2022-11-11 10:08     ` Michal Koutný
2022-11-11 18:24       ` Yosry Ahmed
2022-11-11 20:31         ` Johannes Weiner
