From: Heiko Carstens <heiko.carstens@de.ibm.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Josef Bacik <josef@toxicpanda.com>,
	Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	linux-s390@vger.kernel.org
Subject: Re: [PATCH 2/6] mm: vmstat: move slab statistics from zone to node counters
Date: Wed, 31 May 2017 11:12:56 +0200
Message-ID: <20170531091256.GA5914@osiris>
In-Reply-To: <20170530181724.27197-3-hannes@cmpxchg.org>

On Tue, May 30, 2017 at 02:17:20PM -0400, Johannes Weiner wrote:
> To re-implement slab cache vs. page cache balancing, we'll need the
> slab counters at the lruvec level, which, ever since lru reclaim was
> moved from the zone to the node, is the intersection of the node, not
> the zone, and the memcg.
> 
> We could retain the per-zone counters for when the page allocator
> dumps its memory information on failures, and have counters on both
> levels - which on all but NUMA node 0 is usually redundant. But let's
> keep it simple for now and just move them. If anybody complains we can
> restore the per-zone counters.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

This patch causes an early boot crash on s390 (linux-next as of today),
regardless of whether CONFIG_NUMA is on or off. I haven't looked into
this any further yet; maybe you have an idea?

Kernel BUG at 00000000002b0362 [verbose debug info unavailable]
addressing exception: 0005 ilc:3 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc3-00153-gb6bc6724488a #16
Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
task: 0000000000d75d00 task.stack: 0000000000d60000
Krnl PSW : 0404200180000000 00000000002b0362 (mod_node_page_state+0x62/0x158)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000001 000000003d81f000 0000000000000000 0000000000000006
           0000000000000001 0000000000f29b52 0000000000000041 0000000000000000
           0000000000000007 0000000000000040 000000003fe81000 000003d100ffa000
           0000000000ee1cd0 0000000000979040 0000000000300abc 0000000000d63c90
Krnl Code: 00000000002b0350: e31003900004 lg %r1,912
           00000000002b0356: e320f0a80004 lg %r2,168(%r15)
          #00000000002b035c: e31120000090 llgc %r1,0(%r1,%r2)
          >00000000002b0362: b9060011  lgbr %r1,%r1
           00000000002b0366: e32003900004 lg %r2,912
           00000000002b036c: e3c280000090 llgc %r12,0(%r2,%r8)
           00000000002b0372: b90600ac  lgbr %r10,%r12
           00000000002b0376: b904002a  lgr %r2,%r10
Call Trace:
([<0000000000000000>]           (null))
 [<0000000000300abc>] new_slab+0x35c/0x628
 [<000000000030740c>] __kmem_cache_create+0x33c/0x638
 [<0000000000e99c0e>] create_boot_cache+0xae/0xe0
 [<0000000000e9e12c>] kmem_cache_init+0x5c/0x138
 [<0000000000e7999c>] start_kernel+0x24c/0x440
 [<0000000000100020>] _stext+0x20/0x80
Last Breaking-Event-Address:
 [<0000000000300ab6>] new_slab+0x356/0x628

Kernel panic - not syncing: Fatal exception: panic_on_oops

> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 5548f9686016..e57e06e6df4c 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -129,11 +129,11 @@ static ssize_t node_read_meminfo(struct device *dev,
>  		       nid, K(node_page_state(pgdat, NR_UNSTABLE_NFS)),
>  		       nid, K(sum_zone_node_page_state(nid, NR_BOUNCE)),
>  		       nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
> -		       nid, K(sum_zone_node_page_state(nid, NR_SLAB_RECLAIMABLE) +
> -				sum_zone_node_page_state(nid, NR_SLAB_UNRECLAIMABLE)),
> -		       nid, K(sum_zone_node_page_state(nid, NR_SLAB_RECLAIMABLE)),
> +		       nid, K(node_page_state(pgdat, NR_SLAB_RECLAIMABLE) +
> +			      node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)),
> +		       nid, K(node_page_state(pgdat, NR_SLAB_RECLAIMABLE)),
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -		       nid, K(sum_zone_node_page_state(nid, NR_SLAB_UNRECLAIMABLE)),
> +		       nid, K(node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)),
>  		       nid, K(node_page_state(pgdat, NR_ANON_THPS) *
>  				       HPAGE_PMD_NR),
>  		       nid, K(node_page_state(pgdat, NR_SHMEM_THPS) *
> @@ -141,7 +141,7 @@ static ssize_t node_read_meminfo(struct device *dev,
>  		       nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) *
>  				       HPAGE_PMD_NR));
>  #else
> -		       nid, K(sum_zone_node_page_state(nid, NR_SLAB_UNRECLAIMABLE)));
> +		       nid, K(node_page_state(pgdat, NR_SLAB_UNRECLAIMABLE)));
>  #endif
>  	n += hugetlb_report_node_meminfo(nid, buf + n);
>  	return n;
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index ebaccd4e7d8c..eacadee83964 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -125,8 +125,6 @@ enum zone_stat_item {
>  	NR_ZONE_UNEVICTABLE,
>  	NR_ZONE_WRITE_PENDING,	/* Count of dirty, writeback and unstable pages */
>  	NR_MLOCK,		/* mlock()ed pages found and moved off LRU */
> -	NR_SLAB_RECLAIMABLE,
> -	NR_SLAB_UNRECLAIMABLE,
>  	NR_PAGETABLE,		/* used for pagetables */
>  	NR_KERNEL_STACK_KB,	/* measured in KiB */
>  	/* Second 128 byte cacheline */
> @@ -152,6 +150,8 @@ enum node_stat_item {
>  	NR_INACTIVE_FILE,	/*  "     "     "   "       "         */
>  	NR_ACTIVE_FILE,		/*  "     "     "   "       "         */
>  	NR_UNEVICTABLE,		/*  "     "     "   "       "         */
> +	NR_SLAB_RECLAIMABLE,
> +	NR_SLAB_UNRECLAIMABLE,
>  	NR_ISOLATED_ANON,	/* Temporary isolated pages from anon lru */
>  	NR_ISOLATED_FILE,	/* Temporary isolated pages from file lru */
>  	WORKINGSET_REFAULT,
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f9e450c6b6e4..5f89cfaddc4b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4601,8 +4601,6 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
>  			" present:%lukB"
>  			" managed:%lukB"
>  			" mlocked:%lukB"
> -			" slab_reclaimable:%lukB"
> -			" slab_unreclaimable:%lukB"
>  			" kernel_stack:%lukB"
>  			" pagetables:%lukB"
>  			" bounce:%lukB"
> @@ -4624,8 +4622,6 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
>  			K(zone->present_pages),
>  			K(zone->managed_pages),
>  			K(zone_page_state(zone, NR_MLOCK)),
> -			K(zone_page_state(zone, NR_SLAB_RECLAIMABLE)),
> -			K(zone_page_state(zone, NR_SLAB_UNRECLAIMABLE)),
>  			zone_page_state(zone, NR_KERNEL_STACK_KB),
>  			K(zone_page_state(zone, NR_PAGETABLE)),
>  			K(zone_page_state(zone, NR_BOUNCE)),
> diff --git a/mm/slab.c b/mm/slab.c
> index 2a31ee3c5814..b55853399559 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1425,10 +1425,10 @@ static struct page *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>  
>  	nr_pages = (1 << cachep->gfporder);
>  	if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
> -		add_zone_page_state(page_zone(page),
> +		add_node_page_state(page_pgdat(page),
>  			NR_SLAB_RECLAIMABLE, nr_pages);
>  	else
> -		add_zone_page_state(page_zone(page),
> +		add_node_page_state(page_pgdat(page),
>  			NR_SLAB_UNRECLAIMABLE, nr_pages);
>  
>  	__SetPageSlab(page);
> @@ -1459,10 +1459,10 @@ static void kmem_freepages(struct kmem_cache *cachep, struct page *page)
>  	kmemcheck_free_shadow(page, order);
>  
>  	if (cachep->flags & SLAB_RECLAIM_ACCOUNT)
> -		sub_zone_page_state(page_zone(page),
> +		sub_node_page_state(page_pgdat(page),
>  				NR_SLAB_RECLAIMABLE, nr_freed);
>  	else
> -		sub_zone_page_state(page_zone(page),
> +		sub_node_page_state(page_pgdat(page),
>  				NR_SLAB_UNRECLAIMABLE, nr_freed);
>  
>  	BUG_ON(!PageSlab(page));
> diff --git a/mm/slub.c b/mm/slub.c
> index 57e5156f02be..673e72698d9b 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1615,7 +1615,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>  	if (!page)
>  		return NULL;
>  
> -	mod_zone_page_state(page_zone(page),
> +	mod_node_page_state(page_pgdat(page),
>  		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
>  		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
>  		1 << oo_order(oo));
> @@ -1655,7 +1655,7 @@ static void __free_slab(struct kmem_cache *s, struct page *page)
>  
>  	kmemcheck_free_shadow(page, compound_order(page));
>  
> -	mod_zone_page_state(page_zone(page),
> +	mod_node_page_state(page_pgdat(page),
>  		(s->flags & SLAB_RECLAIM_ACCOUNT) ?
>  		NR_SLAB_RECLAIMABLE : NR_SLAB_UNRECLAIMABLE,
>  		-pages);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c5f9d1673392..5d187ee618c0 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3815,7 +3815,7 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
>  	 * unmapped file backed pages.
>  	 */
>  	if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
> -	    sum_zone_node_page_state(pgdat->node_id, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
> +	    node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
>  		return NODE_RECLAIM_FULL;
>  
>  	/*
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 76f73670200a..a64f1c764f17 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -928,8 +928,6 @@ const char * const vmstat_text[] = {
>  	"nr_zone_unevictable",
>  	"nr_zone_write_pending",
>  	"nr_mlock",
> -	"nr_slab_reclaimable",
> -	"nr_slab_unreclaimable",
>  	"nr_page_table_pages",
>  	"nr_kernel_stack",
>  	"nr_bounce",
> @@ -952,6 +950,8 @@ const char * const vmstat_text[] = {
>  	"nr_inactive_file",
>  	"nr_active_file",
>  	"nr_unevictable",
> +	"nr_slab_reclaimable",
> +	"nr_slab_unreclaimable",
>  	"nr_isolated_anon",
>  	"nr_isolated_file",
>  	"workingset_refault",
> -- 
> 2.12.2