linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Hillf Danton <hdanton@sina.com>
Cc: linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	Minchan Kim <minchan@kernel.org>, Mel Gorman <mgorman@suse.de>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Jan Kara <jack@suse.cz>
Subject: Re: [RFC v1] mm: add page preemption
Date: Wed, 23 Oct 2019 10:17:29 +0200	[thread overview]
Message-ID: <20191023081729.GI754@dhcp22.suse.cz> (raw)
In-Reply-To: <20191022142802.14304-1-hdanton@sina.com>

On Tue 22-10-19 22:28:02, Hillf Danton wrote:
> 
> On Tue, 22 Oct 2019 14:42:41 +0200 Michal Hocko wrote:
> > 
> > On Tue 22-10-19 20:14:39, Hillf Danton wrote:
> > > 
> > > On Mon, 21 Oct 2019 14:27:28 +0200 Michal Hocko wrote:
> > [...]
> > > > Why do we care and which workloads would benefit and how much.
> > > 
> > > Page preemption, disabled by default, should be turned on by those
> > > who wish that the performance of their workloads can survive memory
> > > pressure to certain extent.
> > 
> > I am sorry but this doesn't say anything to me. How come not all
> > workloads would fit that description?
> 
> That means pp plays a role when kswapd becomes active, and it may
> prevent too much jitters in active lru pages.

This is still too vague to be useful in any way.

> > > The number of pp users is supposed near the people who change the
> > > nice value of their apps either to -1 or higher at least once a week,
> > > less than vi users among UK's undergraduates.
> > > 
> > > > And last but not least why the existing infrastructure doesn't help
> > > > (e.g. if you have clearly defined workloads with different
> > > > memory consumption requirements then why don't you use memory cgroups to
> > > > reflect the priority).
> > > 
> > > Good question:)
> > > 
> > > Though pp is implemented by preventing any task from reclaiming as many
> > > pages as possible from other tasks that are higher on priority, it is
> > > trying to introduce prio into page reclaiming, to add a feature.
> > > 
> > > Page and memcg are different objects after all; pp is being added at
> > > the page granularity. It should be an option available in environments
> > > without memcg enabled.
> > 
> > So do you actually want to establish LRUs per priority?
> 
> No, no change other than the prio for every lru page was added. LRU per prio
> is too much to implement.

Well, considering that per page priority is a no go as already pointed
out by Willy then you do not have other choice right?

> > Why using memcgs is not an option?
> 
> I have plan to add prio in memcg. As you see, I sent a rfc before v0 with
> nice added in memcg, and realised a couple days ago that its dependence on
> soft limit reclaim is not acceptable.
> 
> But we can't do that without determining how to define memcg's prio.
> What is in mind now is the highest (or lowest) prio of tasks in a memcg
> with a knob offered to userspace.
> 
> If you like, I want to have a talk about it sometime later.

This doesn't really answer my question. Why cannot you use memcgs as
they are now. Why exactly do you need a fixed priority?

> > This is the main facility to partition reclaimable
> > memory in the first place. You should really focus on explaining on why
> > a much more fine grained control is needed much more thoroughly.
> > 
> > > What is way different from the protections offered by memory cgroup
> > > is that pages protected by memcg:min/low can't be reclaimed regardless
> > > of memory pressure. Such guarantee is not available under pp as it only
> > > suggests an extra factor to consider on deactivating lru pages.
> > 
> > Well, low limit can be breached if there is no eliglible memcg to be
> > reclaimed. That means that you can shape some sort of priority by
> > setting the low limit already.
> > 
> > [...]
> > 
> > > What was added on the reclaimer side is
> > > 
> > > 1, kswapd sets pgdat->kswapd_prio, the switch between page reclaimer
> > >    and allocator in terms of prio, to the lowest value before taking
> > >    a nap.
> > > 
> > > 2, any allocator is able to wake up the reclaimer because of the
> > >    lowest prio, and it starts reclaiming pages using the waker's prio.
> > > 
> > > 3, allocator comes while kswapd is active, its prio is checked and
> > >    no-op if kswapd is higher on prio; otherwise switch is updated
> > >    with the higher prio.
> > > 
> > > 4, every time kswapd raises sc.priority that starts with DEF_PRIORITY,
> > >    it is checked if there is pending update of switch; and kswapd's
> > >    prio steps up if there is a pending one, thus its prio never steps
> > >    down. Nor prio inversion. 
> > > 
> > > 5, goto 1 when kswapd finishes its work.
> > 
> > What about the direct reclaim?
> 
> Their prio will not change before reclaiming finishes, so leave it be.

This doesn't answer my question.

> > What if pages of a lower priority are
> > hard to reclaim? Do you want a process of a higher priority stall more
> > just because it has to wait for those lower priority pages?
> 
> The problems above are not introduced by pp, let Mr. Kswapd take care of
> them.

No, this is not an answer.

-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2019-10-23  8:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20191020134304.11700-1-hdanton@sina.com>
2019-10-20 15:24 ` [RFC v1] mm: add page preemption Matthew Wilcox
2019-10-21 12:27 ` Michal Hocko
     [not found] ` <20191022121439.7164-1-hdanton@sina.com>
2019-10-22 12:42   ` Michal Hocko
     [not found]   ` <20191022142802.14304-1-hdanton@sina.com>
2019-10-23  8:17     ` Michal Hocko [this message]
     [not found]     ` <20191023115350.4956-1-hdanton@sina.com>
2019-10-23 12:04       ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191023081729.GI754@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=hdanton@sina.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=minchan@kernel.org \
    --cc=shakeelb@google.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).