* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
@ 2019-09-24  7:36 Hillf Danton
  2019-09-24 13:30 ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2019-09-24  7:36 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, Johannes Weiner, Andrew Morton, linux, linux-mm,
	Shakeel Butt, Roman Gushchin, Matthew Wilcox


On Mon, 23 Sep 2019 21:28:34 Michal Hocko wrote:
> 
> On Mon 23-09-19 21:04:59, Hillf Danton wrote:
> >
> > On Thu, 19 Sep 2019 21:32:31 +0800 Michal Hocko wrote:
> > >
> > > On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> > > >
> > > > > Currently memory controller is playing increasingly important role in
> > > > how memory is used and how pages are reclaimed on memory pressure.
> > > >
> > > > In daily works memcg is often created for critical tasks and their pre
> > > > configured memory usage is supposed to be met even on memory pressure.
> > > > Administrator wants to make it configurable that the pages consumed by
> > > > memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> > > > by memcg-C.
> > >
> > > I am not really sure I understand the usecase well but this sounds like
> > > what memory reclaim protection in v2 is aiming at.
> > >
> Please describe the usecase.
> 
For quite a while task-A has been able to preempt task-B for CPU
cycles. IOW the physical resource, CPU cycles, is preemptible.

Are physical pages preemptible too in the same manner?
No, not without a priority defined for pages (say, a link between
page->nice and task->nice).

The slrp is added for memcg instead of nice because 1) it is only used
in the page reclaiming context (in memcg it is soft limit reclaiming),
and 2) it is difficult to compare reclaimer and reclaimee task->nice
directly in that context as only info about reclaimer and lru page is
available.

Here task->nice is replaced with memcg->slrp in order to do page
preemption, PP. There is no way for task-A to PP task-B, but the
group containing task-A can PP the group containing task-B.
That preemption needs fewer than 100 lines of code on top of
the current memory controller framework, as you can see.
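
A minimal userspace sketch of that rule, as read from the RFC patch: a
group may lose pages only to a reclaimer whose slrp is not lower than its
own, and among the eligible groups the one with the largest soft limit
excess goes first. The struct group type and pick_victim() are
hypothetical names for illustration only; the patch itself walks the
per-node soft limit rbtree and drops skipped nodes from it, which this
model ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a memcg; not the kernel struct. */
struct group {
	const char *name;
	unsigned long excess;	/* pages over the soft limit */
	int slrp;		/* 0 = default, 1-32 = configured priority */
};

/*
 * Return the group with the largest soft limit excess that the reclaimer
 * may take pages from, or NULL if no group is eligible. A group is
 * skipped when it has no excess or when its slrp is higher than the
 * reclaimer's slrp.
 */
static const struct group *pick_victim(const struct group *groups, size_t n,
				       int reclaimer_slrp)
{
	const struct group *best = NULL;
	size_t i;

	assert(groups != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct group *g = &groups[i];

		if (!g->excess || g->slrp > reclaimer_slrp)
			continue;
		if (!best || g->excess > best->excess)
			best = g;
	}
	return best;
}
```

With groups A (slrp 8), B (slrp 2) and C (slrp 0) all over their soft
limits, a reclaimer running at slrp 2 can pick B or C but never A, and a
kernel thread (treated as slrp 0 in the patch) can pick only C.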

The user visible things following PP include
1) the increase in system-wide configurability,

Combined with and/or in parallel to memcg.high, PP helps the admin
configure and maintain 100 mm groups on systems with 100GB RAM. With every
group's high boundary set to 10MB, he only needs to fiddle with the slrps
of a handful of groups containing critical tasks.

2) the increase in system-wide responsibility,

Because critical groups can be configured to be not page preempted.

3) the gradient field grows in a long running system with priority,

Just like the rivers going through all the ways from mountains to
the seas.

Adding PP in background reclaiming is on the way:
1> define page->nice and link it to task->nice
2> on isolating lru pages check reclaimer->nice against page->nice
   and skip page if reclaimer is lower on priority

> > A pointer to the v2 stuff please.
> 
> Documentation/admin-guide/cgroup-v2.rst
> 
Thanks Michal.

Surprisingly, slrp happens to fall in line with cgroup-v2.

--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1108,6 +1108,17 @@ PAGE_SIZE multiple when read back.
        Going over the high limit never invokes the OOM killer and
        under extreme conditions the limit may be breached.

+  memory.slrp
+       A read-write single value [0-32] file which exists on non-root
+       cgroups.  The default is "0".
+
+       Soft limit reclaiming priority.  This is the mechanism to control
+       how physical pages are reclaimed when a group's memory usage goes
+       over its high boundary.
+
+       It makes sure that no pages will be reclaimed from any group of
+       higher slrp in favor of a lower-slrp group.
+
   memory.max
        A read-write single value file which exists on non-root
        cgroups.  The default is "max".
--

Hillf




* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-24  7:36 [RFC] mm: memcg: add priority for soft limit reclaiming Hillf Danton
@ 2019-09-24 13:30 ` Michal Hocko
  2019-09-24 17:23   ` Roman Gushchin
  2019-09-25  2:35   ` Hillf Danton
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2019-09-24 13:30 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Johannes Weiner, Andrew Morton, linux, linux-mm, Shakeel Butt,
	Roman Gushchin, Matthew Wilcox

On Tue 24-09-19 15:36:42, Hillf Danton wrote:
> 
> On Mon, 23 Sep 2019 21:28:34 Michal Hocko wrote:
> > 
> > On Mon 23-09-19 21:04:59, Hillf Danton wrote:
> > >
> > > On Thu, 19 Sep 2019 21:32:31 +0800 Michal Hocko wrote:
> > > >
> > > > On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> > > > >
> > > > > > Currently memory controller is playing increasingly important role in
> > > > > how memory is used and how pages are reclaimed on memory pressure.
> > > > >
> > > > > In daily works memcg is often created for critical tasks and their pre
> > > > > configured memory usage is supposed to be met even on memory pressure.
> > > > > Administrator wants to make it configurable that the pages consumed by
> > > > > memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> > > > > by memcg-C.
> > > >
> > > > I am not really sure I understand the usecase well but this sounds like
> > > > what memory reclaim protection in v2 is aiming at.
> > > >
> > Please describe the usecase.
> > 
> It is for quite a while that task-A has been able to preempt task-B for
> cpu cycles. IOW the physical resource cpu cycles are preemptible.
> 
> Are physical pages preemptible too in the same manner?
> Nope without priority defined for pages currently (say the link between
> page->nice and task->nice).
> 
> The slrp is added for memcg instead of nice because 1) it is only used
> in the page reclaiming context (in memcg it is soft limit reclaiming),
> and 2) it is difficult to compare reclaimer and reclaimee task->nice
> directly in that context as only info about reclaimer and lru page is
> available.
> 
> Here task->nice is replaced with memcg->slrp in order to do page
> preemption, PP. There is no way for task-A to PP task-B, but the
> group containing task-A can PP the group containing task-B.
> That preemption needs code within 100 lines as you see on top of
> the current memory controller framework.

This is exactly what the reclaim protection in memcg v2 is meant to be
used for. Also soft limit reclaim is absolutely terrible to achieve that
because it is just too gross to result in any smooth experience (just
have a look at how it is doing priority 0 scanning!).

I am not going to even go further wrt the implementation because I
believe the priority is even semantically broken wrt hierarchical
behavior.

But really, make sure you look into the existing feature set that memcg
v2 provides already and come back if you find it unsuitable and we can
move from there. Soft limit reclaim is dead and we should let it RIP.
-- 
Michal Hocko
SUSE Labs


* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-24 13:30 ` Michal Hocko
@ 2019-09-24 17:23   ` Roman Gushchin
  2019-09-25  2:35   ` Hillf Danton
  1 sibling, 0 replies; 9+ messages in thread
From: Roman Gushchin @ 2019-09-24 17:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, Johannes Weiner, Andrew Morton, linux, linux-mm,
	Shakeel Butt, Matthew Wilcox

On Tue, Sep 24, 2019 at 03:30:16PM +0200, Michal Hocko wrote:
> On Tue 24-09-19 15:36:42, Hillf Danton wrote:
> > 
> > On Mon, 23 Sep 2019 21:28:34 Michal Hocko wrote:
> > > 
> > > On Mon 23-09-19 21:04:59, Hillf Danton wrote:
> > > >
> > > > On Thu, 19 Sep 2019 21:32:31 +0800 Michal Hocko wrote:
> > > > >
> > > > > On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> > > > > >
> > > > > > > Currently memory controller is playing increasingly important role in
> > > > > > how memory is used and how pages are reclaimed on memory pressure.
> > > > > >
> > > > > > In daily works memcg is often created for critical tasks and their pre
> > > > > > configured memory usage is supposed to be met even on memory pressure.
> > > > > > Administrator wants to make it configurable that the pages consumed by
> > > > > > memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> > > > > > by memcg-C.
> > > > >
> > > > > I am not really sure I understand the usecase well but this sounds like
> > > > > what memory reclaim protection in v2 is aiming at.
> > > > >
> > > Please describe the usecase.
> > > 
> > It is for quite a while that task-A has been able to preempt task-B for
> > cpu cycles. IOW the physical resource cpu cycles are preemptible.
> > 
> > Are physical pages preemptible too in the same manner?
> > Nope without priority defined for pages currently (say the link between
> > page->nice and task->nice).
> > 
> > The slrp is added for memcg instead of nice because 1) it is only used
> > in the page reclaiming context (in memcg it is soft limit reclaiming),
> > and 2) it is difficult to compare reclaimer and reclaimee task->nice
> > directly in that context as only info about reclaimer and lru page is
> > available.
> > 
> > Here task->nice is replaced with memcg->slrp in order to do page
> > preemption, PP. There is no way for task-A to PP task-B, but the
> > group containing task-A can PP the group containing task-B.
> > That preemption needs code within 100 lines as you see on top of
> > the current memory controller framework.
> 
> This is exactly what the reclaim protection in memcg v2 is meant to be
> used for. Also soft limit reclaim is absolutely terrible to achieve that
> because it is just too gross to result in any smooth experience (just
> have a look at how it is doing priority 0 scanning!).
> 
> I am not going to even go further wrt the implementation because I
> believe the priority is even semantically broken wrt hierarchical
> behavior.
> 
> But really, make sure you look into the existing feature set that memcg
> v2 provides already and come back if you find it unsuitable and we can
> move from there. Soft limit reclaim is dead and we should let it RIP.

Can't agree more here.

Cgroup v2 memory protection mechanisms (memory.low/min) should perfectly
solve the described problem. If not, let's fix them rather than extend soft
reclaim which is already dead.

Thanks!
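
For reference, a minimal sketch of how such protection might be set from
userspace: memory.low is an ordinary cgroup v2 control file, so writing a
byte count to it is enough. The helper name set_memory_low() and any
cgroup directory passed to it (say /sys/fs/cgroup/critical) are
assumptions for illustration and not something taken from this thread.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a byte count to the memory.low file inside cgroup_dir.
 * Returns 0 on success and -1 on failure. Write errors reported by the
 * kernel typically surface when the buffered stream is flushed, hence
 * the check on fclose().
 */
static int set_memory_low(const char *cgroup_dir, unsigned long long bytes)
{
	char path[4096];
	FILE *f;
	int n;

	assert(cgroup_dir != NULL);

	n = snprintf(path, sizeof(path), "%s/memory.low", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%llu\n", bytes) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

Calling the helper again with a different value is all that adjusting
the protection at runtime would take; which groups get how much is a
policy decision left to the administrator.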


* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-24 13:30 ` Michal Hocko
  2019-09-24 17:23   ` Roman Gushchin
@ 2019-09-25  2:35   ` Hillf Danton
  2019-09-25  6:52     ` Michal Hocko
  1 sibling, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2019-09-25  2:35 UTC (permalink / raw)
  To: Roman Gushchin, Michal Hocko
  Cc: Hillf Danton, Johannes Weiner, Andrew Morton, linux, linux-mm,
	Shakeel Butt, Matthew Wilcox


On Tue, 24 Sep 2019 17:23:35 +0000 from Roman Gushchin
> 
> On Tue, Sep 24, 2019 at 03:30:16PM +0200, Michal Hocko wrote:
> >
> > But really, make sure you look into the existing feature set that memcg
> > v2 provides already and come back if you find it unsuitable and we can
> > move from there. Soft limit reclaim is dead and we should let it RIP.
> 
> Can't agree more here.
> 
> Cgroup v2 memory protection mechanisms (memory.low/min) should perfectly
> solve the described problem. If not, let's fix them rather than extend soft
> reclaim which is already dead.
> 
Hehe, IIUC memory.low/min is essentially drawing a line that reclaimers
would try their best not to cross. Page preemption OTOH is near ten miles
away from that line though it is now on the shoulder of soft reclaiming.




* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-25  2:35   ` Hillf Danton
@ 2019-09-25  6:52     ` Michal Hocko
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2019-09-25  6:52 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Roman Gushchin, Johannes Weiner, Andrew Morton, linux, linux-mm,
	Shakeel Butt, Matthew Wilcox

On Wed 25-09-19 10:35:30, Hillf Danton wrote:
> 
> On Tue, 24 Sep 2019 17:23:35 +0000 from Roman Gushchin
> > 
> > On Tue, Sep 24, 2019 at 03:30:16PM +0200, Michal Hocko wrote:
> > >
> > > But really, make sure you look into the existing feature set that memcg
> > > v2 provides already and come back if you find it unsuitable and we can
> > > move from there. Soft limit reclaim is dead and we should let it RIP.
> > 
> > Can't agree more here.
> > 
> > Cgroup v2 memory protection mechanisms (memory.low/min) should perfectly
> > solve the described problem. If not, let's fix them rather than extend soft
> > reclaim which is already dead.
> > 
> Hehe, IIUC memory.low/min is essentially drawing a line that reclaimers
> would try their best not to cross. Page preemption OTOH is near ten miles
> away from that line though it is now on the shoulder of soft reclaiming.

Dynamic low limit tuning would achieve exactly what you are after - aka
prioritizing some memory consumers over others.
-- 
Michal Hocko
SUSE Labs


* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-23 13:04   ` Hillf Danton
@ 2019-09-23 13:28     ` Michal Hocko
  0 siblings, 0 replies; 9+ messages in thread
From: Michal Hocko @ 2019-09-23 13:28 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Johannes Weiner, Andrew Morton, linux-kernel, linux-mm,
	Shakeel Butt, Roman Gushchin, Matthew Wilcox

On Mon 23-09-19 21:04:59, Hillf Danton wrote:
> 
> On Thu, 19 Sep 2019 21:32:31 +0800 Michal Hocko wrote:
> > 
> > On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> > >
> > > > Currently memory controller is playing increasingly important role in
> > > how memory is used and how pages are reclaimed on memory pressure.
> > >
> > > In daily works memcg is often created for critical tasks and their pre
> > > configured memory usage is supposed to be met even on memory pressure.
> > > Administrator wants to make it configurable that the pages consumed by
> > > memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> > > by memcg-C.
> > 
> > I am not really sure I understand the usecase well but this sounds like
> > what memory reclaim protection in v2 is aiming at.
> > 

Please describe the usecase. 

> A pointer to the v2 stuff please.

Documentation/admin-guide/cgroup-v2.rst
 
> > > That configurability is addressed by adding priority for soft limit
> > > reclaiming to make sure that no pages will be reclaimed from memcg of
> > > > higher priority in favor of memcg of lower priority.
> > 
> > cgroup v1 interfaces are generally frozen and mostly aimed at backward
> > compatibility. I am especially concerned about adding a new way to
> > control soft limit which is known to be misdesigned and unfixable to
> > behave reasonably.
> >
> A URL to the drafts/works about the new way in your git tree.

Whut?
-- 
Michal Hocko
SUSE Labs


* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
@ 2019-09-23 13:04   ` Hillf Danton
  2019-09-23 13:28     ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2019-09-23 13:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, Johannes Weiner, Andrew Morton, linux-kernel,
	linux-mm, Shakeel Butt, Roman Gushchin, Matthew Wilcox


On Thu, 19 Sep 2019 21:32:31 +0800 Michal Hocko wrote:
> 
> On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> >
> > Currently memory controller is playing increasingly important role in
> > how memory is used and how pages are reclaimed on memory pressure.
> >
> > In daily works memcg is often created for critical tasks and their pre
> > configured memory usage is supposed to be met even on memory pressure.
> > Administrator wants to make it configurable that the pages consumed by
> > memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> > by memcg-C.
> 
> I am not really sure I understand the usecase well but this sounds like
> what memory reclaim protection in v2 is aiming at.
> 
A pointer to the v2 stuff please.

> > That configurability is addressed by adding priority for soft limit
> > reclaiming to make sure that no pages will be reclaimed from memcg of
> > higher priority in favor of memcg of lower priority.
> 
> cgroup v1 interfaces are generally frozen and mostly aimed at backward
> compatibility. I am especially concerned about adding a new way to
> control soft limit which is known to be misdesigned and unfixable to
> behave reasonably.
>
A URL to the drafts/work on the new way in your git tree, please.

Thanks
Hillf




* Re: [RFC] mm: memcg: add priority for soft limit reclaiming
  2019-09-19 13:13 Hillf Danton
@ 2019-09-19 13:32 ` Michal Hocko
  2019-09-23 13:04   ` Hillf Danton
  0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2019-09-19 13:32 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Johannes Weiner, Andrew Morton, linux-kernel, linux-mm,
	Shakeel Butt, Roman Gushchin, Matthew Wilcox

On Thu 19-09-19 21:13:32, Hillf Danton wrote:
> 
> Currently memory controller is playing increasingly important role in
> how memory is used and how pages are reclaimed on memory pressure.
> 
> In daily works memcg is often created for critical tasks and their pre
> configured memory usage is supposed to be met even on memory pressure.
> Administrator wants to make it configurable that the pages consumed by
> memcg-B can be reclaimed by page allocations invoked not by memcg-A but
> by memcg-C.

I am not really sure I understand the usecase well but this sounds like
what memory reclaim protection in v2 is aiming at.
 
> That configurability is addressed by adding priority for soft limit
> reclaiming to make sure that no pages will be reclaimed from memcg of
> higher priority in favor of memcg of lower priority.

cgroup v1 interfaces are generally frozen and mostly aimed at backward
compatibility. I am especially concerned about adding a new way to
control soft limit which is known to be misdesigned and unfixable to
behave reasonably.

> Pages are reclaimed with no priority being taken into account by default
> unless user turns it on, and then they are responsible for their smart
> activities almost the same way as they play realtime FIFO/RR games.
> 
> Priority is available only in the direct reclaiming context in order to
> avoid churning in the complex kswapd behavior.
> 
> Cc: Shakeel Butt <shakeelb@google.com>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: Hillf Danton <hdanton@sina.com>

That being said, you should describe the usecase and explain why v2
interface is not providing what you need. We might think about where to
go from there but extending the soft limit reclaim is almost certainly
not the right way to go.
-- 
Michal Hocko
SUSE Labs


* [RFC] mm: memcg: add priority for soft limit reclaiming
@ 2019-09-19 13:13 Hillf Danton
  2019-09-19 13:32 ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Hillf Danton @ 2019-09-19 13:13 UTC (permalink / raw)
  To: Michal Hocko, Johannes Weiner
  Cc: Andrew Morton, linux-kernel, linux-mm, Shakeel Butt,
	Roman Gushchin, Matthew Wilcox, Hillf Danton


Currently the memory controller is playing an increasingly important role
in how memory is used and how pages are reclaimed under memory pressure.

In daily work a memcg is often created for critical tasks, and its
preconfigured memory usage is supposed to be met even under memory
pressure. The administrator wants to make it configurable that the pages
consumed by memcg-B can be reclaimed by page allocations invoked not by
memcg-A but by memcg-C.

That configurability is addressed by adding a priority for soft limit
reclaiming to make sure that no pages will be reclaimed from a memcg of
higher priority in favor of a memcg of lower priority.

By default pages are reclaimed with no priority taken into account. Once
the user turns it on, they are responsible for their smart activities in
almost the same way as when they play realtime FIFO/RR games.

Priority is available only in the direct reclaiming context in order to
avoid churning in the complex kswapd behavior.

Cc: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hillf Danton <hdanton@sina.com>
---

--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -230,6 +230,21 @@ struct mem_cgroup {
 	int		under_oom;
 
 	int	swappiness;
+	/*
+	 * slrp, soft limit reclaiming priority
+	 *
+	 * 0, by default, no slrp considered on soft reclaiming.
+	 *
+	 * 1-32, user configurable in ascending order,
+	 * 	no page will be reclaimed from memcg of higher slrp in
+	 * 	favor of memcg of lower slrp.
+	 *
+	 * only in direct reclaiming context now.
+	 */
+	int	slrp;
+#define MEMCG_SLRP_MIN 1
+#define MEMCG_SLRP_MAX 32
+
 	/* OOM-Killer disable */
 	int		oom_kill_disable;
 
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -647,7 +647,8 @@ static void mem_cgroup_remove_from_trees
 }
 
 static struct mem_cgroup_per_node *
-__mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
+__mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz,
+					int slrp)
 {
 	struct mem_cgroup_per_node *mz;
 
@@ -664,7 +665,7 @@ retry:
 	 * position in the tree.
 	 */
 	__mem_cgroup_remove_exceeded(mz, mctz);
-	if (!soft_limit_excess(mz->memcg) ||
+	if (!soft_limit_excess(mz->memcg) || mz->memcg->slrp > slrp ||
 	    !css_tryget_online(&mz->memcg->css))
 		goto retry;
 done:
@@ -672,12 +673,13 @@ done:
 }
 
 static struct mem_cgroup_per_node *
-mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
+mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz,
+					int slrp)
 {
 	struct mem_cgroup_per_node *mz;
 
 	spin_lock_irq(&mctz->lock);
-	mz = __mem_cgroup_largest_soft_limit_node(mctz);
+	mz = __mem_cgroup_largest_soft_limit_node(mctz, slrp);
 	spin_unlock_irq(&mctz->lock);
 	return mz;
 }
@@ -2972,6 +2974,31 @@ static int mem_cgroup_resize_max(struct
 	return ret;
 }
 
+static int mem_cgroup_get_slrp(void)
+{
+	int slrp;
+
+	if (current->flags & PF_KTHREAD) {
+		/*
+		 * now slrp does not churn in background reclaiming to
+		 * make life simple
+		 */
+		slrp = 0;
+	} else {
+		struct mem_cgroup *memcg;
+
+		rcu_read_lock();
+		memcg = mem_cgroup_from_task(current);
+		if (!memcg || memcg == root_mem_cgroup)
+			slrp = 0;
+		else
+			slrp = memcg->slrp;
+		rcu_read_unlock();
+	}
+
+	return slrp;
+}
+
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 					    gfp_t gfp_mask,
 					    unsigned long *total_scanned)
@@ -2980,6 +3007,7 @@ unsigned long mem_cgroup_soft_limit_recl
 	struct mem_cgroup_per_node *mz, *next_mz = NULL;
 	unsigned long reclaimed;
 	int loop = 0;
+	int slrp;
 	struct mem_cgroup_tree_per_node *mctz;
 	unsigned long excess;
 	unsigned long nr_scanned;
@@ -2997,6 +3025,7 @@ unsigned long mem_cgroup_soft_limit_recl
 	if (!mctz || RB_EMPTY_ROOT(&mctz->rb_root))
 		return 0;
 
+	slrp = mem_cgroup_get_slrp();
 	/*
 	 * This loop can run a while, specially if mem_cgroup's continuously
 	 * keep exceeding their soft limit and putting the system under
@@ -3006,7 +3035,7 @@ unsigned long mem_cgroup_soft_limit_recl
 		if (next_mz)
 			mz = next_mz;
 		else
-			mz = mem_cgroup_largest_soft_limit_node(mctz);
+			mz = mem_cgroup_largest_soft_limit_node(mctz, slrp);
 		if (!mz)
 			break;
 
@@ -3024,8 +3053,8 @@ unsigned long mem_cgroup_soft_limit_recl
 		 */
 		next_mz = NULL;
 		if (!reclaimed)
-			next_mz = __mem_cgroup_largest_soft_limit_node(mctz);
-
+			next_mz = __mem_cgroup_largest_soft_limit_node(mctz,
+							slrp);
 		excess = soft_limit_excess(mz->memcg);
 		/*
 		 * One school of thought says that we should not add
@@ -5817,6 +5846,37 @@ static ssize_t memory_oom_group_write(st
 	return nbytes;
 }
 
+static int memory_slrp_show(struct seq_file *m, void *v)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
+
+	seq_printf(m, "%d\n", memcg->slrp);
+
+	return 0;
+}
+
+static ssize_t memory_slrp_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	int ret, slrp;
+
+	buf = strstrip(buf);
+	if (!*buf)
+		return -EINVAL;
+
+	ret = kstrtoint(buf, 0, &slrp);
+	if (ret)
+		return ret;
+
+	if (slrp < 0 || slrp > MEMCG_SLRP_MAX)
+		return -EINVAL;
+
+	memcg->slrp = slrp;
+
+	return nbytes;
+}
+
 static struct cftype memory_files[] = {
 	{
 		.name = "current",
@@ -5870,6 +5930,12 @@ static struct cftype memory_files[] = {
 		.seq_show = memory_oom_group_show,
 		.write = memory_oom_group_write,
 	},
+	{
+		.name = "slrp",
+		.flags = CFTYPE_NOT_ON_ROOT | CFTYPE_NS_DELEGATABLE,
+		.seq_show = memory_slrp_show,
+		.write = memory_slrp_write,
+	},
 	{ }	/* terminate */
 };
 


