All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	Anton Blanchard <anton@samba.org>,
	linuxppc-dev@lists.ozlabs.org, mgorman@suse.de
Subject: Re: [PATCH] powerpc/mm: Fix RECLAIM_DISTANCE
Date: Tue, 31 Jan 2017 16:40:29 +1100	[thread overview]
Message-ID: <20170131054029.GA9937@gwshan> (raw)
In-Reply-To: <20170131050104.GB25724@gwshan>

On Tue, Jan 31, 2017 at 04:01:05PM +1100, Gavin Shan wrote:
>On Tue, Jan 31, 2017 at 08:11:39AM +1100, Michael Ellerman wrote:
>>Gavin Shan <gwshan@linux.vnet.ibm.com> writes:
>>
>>I'd like to see some test results from multi-node systems.
>>
>>I'd also like to understand what has changed since we changed
>>RECLAIM_DISTANCE in the first place, ie. why did it used to work and now
>>doesn't?
>>
>
>[Ccing Mel]
>
>Michael, thanks for review. I would like to explain a bit more. The issue
>addressed by the patch is irrelevant to the number of NUMA nodes. There
>is one procfs entry ("/proc/sys/vm/zone_reclaim_mode") which corresponds
>to variable @node_reclaim_mode (their names don't match!). it can have
>belowing bits or any combination of them. Its default value is RECLAIM_OFF (0).
>Note RECLAIM_ZONE was obsoleted and I will send one patch to remove it
>later.
>
>#define RECLAIM_OFF 0
>#define RECLAIM_ZONE (1<<0)     /* Run shrink_inactive_list on the zone */
>#define RECLAIM_WRITE (1<<1)    /* Writeout pages during reclaim */
>#define RECLAIM_UNMAP (1<<2)    /* Unmap pages during reclaim */
>
>When @node_reclaim_mode is set to (RECLAIM_WRITE | RECLAIM_UNMAP), node_reclaim()
>isn't called on the preferred node as the condition is false: zone_allows_reclaim(
>node-A, node-A). As I observed, the distance from node-A to node-A is 10, equal to
>RECLAIM_DISTANCE.
>
>static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>{
>        return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <
>                                RECLAIM_DISTANCE;
>}
>
>
>__alloc_pages_nodemask
>   get_page_from_freelist     <- WATERMARK_LOW
>      zone_watermark_fast     <- Assume the allocation is breaking WATERMARK_LOW
>         node_reclaim         <- @node_reclaim_node isn't 0 and
>                                 zone_allows_reclaim(preferred_zone, current_zone) returns true
>         __node_reclaim       <- SWAP, WRITEPAGE and UNMAP setting from @node_reclaim_node
>         shrink_node
>      buffered_rmqueue
>   __alloc_pages_slowpath
>      get_page_from_freelist        <- WATERMARK_MIN
>      __alloc_pages_direct_compact  <- If it's costly allocation (order > 3)
>      wake_all_kswapds
>      get_page_from_freelist        <- NO_WATERMARK, CPU local node is set to preferred one
>      __alloc_pages_direct_reclaim
>         __perform_reclaim
>         try_to_free_pages          <- WRITEPAGE + UNMAP + SWAP
>         do_try_to_free_pages
>         shrink_zones               <- Stop until priority (12) reaches to 0 or reclaimed enough
>         shrink_node
>      __alloc_pages_direct_compact
>
>
>Also, RECLAIM_DISTANCE is set to 30 in include/linux/topology.h. It's used when arch
>doesn't provide one. It's why I set this macro to 30 in this patch. This issue is 
>introduced by commit 5f7a75acdb2 ("mm: page_alloc: do not cache reclaim distances").
>In the patch, it had wrong replacement. So I would correct the wrong replacement
>alternatively. Or both of them. Which way do you think is the best? Maybe Mel also
>has thoughts.
>
>     39  static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>     40  {
>     41 -       return node_isset(local_zone->node, zone->zone_pgdat->reclaim_nodes);
>     42 -}
>     43 -
>     44 -static void __paginginit init_zone_allows_reclaim(int nid)
>     45 -{
>     46 -       int i;
>     47 -
>     48 -       for_each_node_state(i, N_MEMORY)
>     49 -               if (node_distance(nid, i) <= RECLAIM_DISTANCE)
>     50 -                       node_set(i, NODE_DATA(nid)->reclaim_nodes);
>     51 +       return node_distance(zone_to_nid(local_zone), zone_to_nid(zone)) <
>     52 +                               RECLAIM_DISTANCE;
>     53  }
>

Sorry, to make it clear. The patch replaced "<=" with "<" wrongly :)

Thanks,
Gavin

  reply	other threads:[~2017-01-31  5:41 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-23 23:32 [PATCH] powerpc/mm: Fix RECLAIM_DISTANCE Gavin Shan
2017-01-25  3:57 ` Balbir Singh
2017-01-25  4:58   ` Gavin Shan
2017-01-27 12:49     ` Balbir Singh
2017-01-30  1:02       ` Anton Blanchard
2017-01-30  4:38         ` Gavin Shan
2017-01-30 21:11           ` Michael Ellerman
2017-01-31  5:01             ` Gavin Shan
2017-01-31  5:40               ` Gavin Shan [this message]
2017-02-07 23:40               ` Gavin Shan
2017-01-31  4:33         ` Gavin Shan
2017-01-31  4:58           ` Anton Blanchard
2017-01-31  5:30             ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170131054029.GA9937@gwshan \
    --to=gwshan@linux.vnet.ibm.com \
    --cc=anton@samba.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@suse.de \
    --cc=mpe@ellerman.id.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.