From: "Huang, Ying" <ying.huang@intel.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Christoph Lameter <cl@linux.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm: fix draining remote pageset
Date: Tue, 22 Aug 2023 06:31:42 +0800
Message-ID: <87msykc9ip.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <ZOMuCiZ07N+L/ljG@dhcp22.suse.cz> (Michal Hocko's message of "Mon, 21 Aug 2023 11:27:38 +0200")

Michal Hocko <mhocko@suse.com> writes:

> On Mon 21-08-23 16:30:18, Huang, Ying wrote:
>> Michal Hocko <mhocko@suse.com> writes:
>> 
>> > On Wed 16-08-23 15:08:23, Huang, Ying wrote:
>> >> Michal Hocko <mhocko@suse.com> writes:
>> >> 
>> >> > On Mon 14-08-23 09:59:51, Huang, Ying wrote:
>> >> >> Hi, Michal,
>> >> >> 
>> >> >> Michal Hocko <mhocko@suse.com> writes:
>> >> >> 
>> >> >> > On Fri 11-08-23 17:08:19, Huang Ying wrote:
>> >> >> >> If there is no memory allocation/freeing in the remote pageset after
>> >> >> >> some time (3 seconds for now), the remote pageset will be drained to
>> >> >> >> avoid memory wastage.
>> >> >> >> 
>> >> >> >> But in the current implementation, vmstat updater worker may not be
>> >> >> >> re-queued when we are waiting for the timeout (pcp->expire != 0) if
>> >> >> >> there are no vmstat changes, for example, when CPU goes idle.
>> >> >> >
>> >> >> > Why is that a problem?
>> >> >> 
>> >> >> The pages of the remote zone may be kept in the local per-CPU pageset
>> >> >> for a long time as long as there's no page allocation/freeing on the
>> >> >> logical CPU.  Besides the logical CPU going idle, this is also
>> >> >> possible if the logical CPU is busy in user space.
>> >> >
>> >> > But why is this a problem? Is the scale of the problem sufficient to
>> >> > trigger out of memory situations or be otherwise harmful?
>> >> 
>> >> This may trigger premature page reclaiming.  The pages in the PCP of the
>> >> remote zone would have been freed to satisfy the page allocation for the
>> >> remote zone to avoid page reclaiming.  It's highly possible that the
>> >> local CPU just allocates/frees from/to the remote zone temporarily.
>> >
>> > I am slightly confused here but I suspect by zone you mean remote pcp.
>> > But more importantly is this a concern seen in real workload? Can you
>> > quantify it in some manner? E.g. with this patch we have X more kswapd
>> > scanning or even hit direct reclaim much less often.
>> >> So,
>> >> we should free PCP pages of the remote zone if there is no page
>> >> allocation/freeing from/to the remote zone for 3 seconds.
>> >
>> > Well, I would argue this depends a lot. There are workloads which really
>> > like to have CPUs idle and yet they would like to benefit from the
>> > allocator fast path after that CPU goes out of idle because idling is
>> > their power saving opportunity while workloads want to act quickly after
>> > there is something to run.
>> >
>> > That being said, we really need some numbers (ideally from real world)
>> > that proves this is not just a theoretical concern.
>> 
>> The behavior to drain the PCP of the remote zone (that is, remote PCP)
>> was introduced in commit 4ae7c03943fc ("[PATCH] Periodically drain non
>> local pagesets").  The goal of draining was well documented in the
>> change log.  IIUC, some of your questions can be answered there?
>> 
>> This patch just restores the original behavior changed by commit
>> 7cc36bbddde5 ("vmstat: on-demand vmstat workers V8").
>
> Let me repeat. You need some numbers to show this is needed.

I have done some testing for this patch, as follows:

- Run some workloads, use `numactl` to bind CPU to node 0 and memory to
  node 1.  So the PCP of the CPU on node 0 for zone on node 1 will be
  filled.

- After workloads finish, idle for 60s

- Check /proc/zoneinfo

With the original kernel, the number of pages in the PCP of the CPU on
node 0 for the zone on node 1 is non-zero after the idle period.  With
the patched kernel, it becomes 0 after the idle period.  That is, pages
are no longer kept in the remote PCP during idle.

These are the numbers I have.  If you think they aren't enough to
justify the patch, I'm OK with that too (although I think they are),
because the remote PCP will be drained later anyway, when some pages
are allocated/freed on the CPU.

--
Best Regards,
Huang, Ying

