Subject: Re: [PATCH] mm/slub: Detach node lock from counting free objects
From: Wen Yang <wenyang@linux.alibaba.com>
To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton
Cc: Xunlei Pang, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Sat, 8 Feb 2020 11:03:15 +0800
Message-ID: <5373ce28-c369-4e40-11dd-b269e4d2cb24@linux.alibaba.com>
In-Reply-To: <20200201031502.92218-1-wenyang@linux.alibaba.com>
References: <20200201031502.92218-1-wenyang@linux.alibaba.com>


Hi,

I would greatly appreciate any feedback you can give on this patch.

--

Best wishes,
Wen


On 2020/2/1 11:15 AM, Wen Yang wrote:
The lock protecting the node partial list is taken when counting the free
objects resident in that list. This introduces lock contention when pages
are moved between the CPU and node partial lists on the allocation path on
another CPU, so reading "/proc/slabinfo" can block slab allocation on
another CPU for a while, 200ms in extreme cases. If the slab objects carry
network packets targeting a far-end disk array, this causes a block I/O
jitter issue.

Fix the block I/O jitter issue by caching the total number of in-use
objects per node in advance. The cached value is then retrieved without
taking the node partial list lock when "/proc/slabinfo" is read.
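
For reference, the reader-side path that previously needed the lock looks
roughly like the sketch below (a simplified paraphrase of the
count_partial()/count_free() pair that "/proc/slabinfo" no longer depends
on after this patch, not a verbatim copy of mm/slub.c; the helper name is
made up):

/*
 * Simplified sketch: what a "/proc/slabinfo" read previously had to do
 * per node - take n->list_lock and walk every page on the partial list
 * just to add up the free object counts.
 */
static unsigned long count_free_on_partial(struct kmem_cache_node *n)
{
	struct page *page;
	unsigned long nr_free = 0;
	unsigned long flags;

	spin_lock_irqsave(&n->list_lock, flags);
	list_for_each_entry(page, &n->partial, slab_list)
		nr_free += page->objects - page->inuse;
	spin_unlock_irqrestore(&n->list_lock, flags);

	return nr_free;
}

With the cached total_inuse counter added below, get_slabinfo() can report
active_objs from a single atomic_long_read() instead of this locked walk.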

Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Xunlei Pang <xlpang@linux.alibaba.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/slab.h |  1 +
 mm/slub.c | 42 +++++++++++++++++++++++++-----------------
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 7e94700aa78c..27d22837f7ff 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -619,6 +619,7 @@ struct kmem_cache_node {
 #ifdef CONFIG_SLUB_DEBUG
 	atomic_long_t nr_slabs;
 	atomic_long_t total_objects;
+	atomic_long_t total_inuse;
 	struct list_head full;
 #endif
 #endif
diff --git a/mm/slub.c b/mm/slub.c
index 503e11b1c4e1..67640e797550 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1060,7 +1060,8 @@ static inline unsigned long node_nr_slabs(struct kmem_cache_node *n)
 	return atomic_long_read(&n->nr_slabs);
 }
 
-static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
+static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects,
+				  int inuse)
 {
 	struct kmem_cache_node *n = get_node(s, node);
 
@@ -1073,14 +1074,17 @@ static inline void inc_slabs_node(struct kmem_cache *s, int node, int objects)
 	if (likely(n)) {
 		atomic_long_inc(&n->nr_slabs);
 		atomic_long_add(objects, &n->total_objects);
+		atomic_long_add(inuse, &n->total_inuse);
 	}
 }
-static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects)
+static inline void dec_slabs_node(struct kmem_cache *s, int node, int objects,
+				  int inuse)
 {
 	struct kmem_cache_node *n = get_node(s, node);
 
 	atomic_long_dec(&n->nr_slabs);
 	atomic_long_sub(objects, &n->total_objects);
+	atomic_long_sub(inuse, &n->total_inuse);
 }
 
 /* Object debug checks for alloc/free paths */
@@ -1395,9 +1399,11 @@ static inline unsigned long slabs_node(struct kmem_cache *s, int node)
 static inline unsigned long node_nr_slabs(struct kmem_cache_node *n)
 							{ return 0; }
 static inline void inc_slabs_node(struct kmem_cache *s, int node,
-							int objects) {}
+							int objects,
+							int inuse) {}
 static inline void dec_slabs_node(struct kmem_cache *s, int node,
-							int objects) {}
+							int objects,
+							int inuse) {}
 
 #endif /* CONFIG_SLUB_DEBUG */
 
@@ -1708,7 +1714,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	if (!page)
 		return NULL;
 
-	inc_slabs_node(s, page_to_nid(page), page->objects);
+	inc_slabs_node(s, page_to_nid(page), page->objects, page->inuse);
 
 	return page;
 }
@@ -1768,7 +1774,9 @@ static void free_slab(struct kmem_cache *s, struct page *page)
 
 static void discard_slab(struct kmem_cache *s, struct page *page)
 {
-	dec_slabs_node(s, page_to_nid(page), page->objects);
+	int inuse = page->objects;
+
+	dec_slabs_node(s, page_to_nid(page), page->objects, inuse);
 	free_slab(s, page);
 }
 
@@ -2396,9 +2404,9 @@ static inline int node_match(struct page *page, int node)
 }
 
 #ifdef CONFIG_SLUB_DEBUG
-static int count_free(struct page *page)
+static inline unsigned long node_nr_inuse(struct kmem_cache_node *n)
 {
-	return page->objects - page->inuse;
+	return atomic_long_read(&n->total_inuse);
 }
 
 static inline unsigned long node_nr_objs(struct kmem_cache_node *n)
@@ -2448,14 +2456,14 @@ slab_out_of_memory(struct kmem_cache *s, gfp_t gfpflags, int nid)
 	for_each_kmem_cache_node(s, node, n) {
 		unsigned long nr_slabs;
 		unsigned long nr_objs;
-		unsigned long nr_free;
+		unsigned long nr_inuse;
 
-		nr_free  = count_partial(n, count_free);
 		nr_slabs = node_nr_slabs(n);
 		nr_objs  = node_nr_objs(n);
+		nr_inuse = node_nr_inuse(n);
 
 		pr_warn("  node %d: slabs: %ld, objs: %ld, free: %ld\n",
-			node, nr_slabs, nr_objs, nr_free);
+			node, nr_slabs, nr_objs, nr_objs - nr_inuse);
 	}
 #endif
 }
@@ -3348,6 +3356,7 @@ init_kmem_cache_node(struct kmem_cache_node *n)
 #ifdef CONFIG_SLUB_DEBUG
 	atomic_long_set(&n->nr_slabs, 0);
 	atomic_long_set(&n->total_objects, 0);
+	atomic_long_set(&n->total_inuse, 0);
 	INIT_LIST_HEAD(&n->full);
 #endif
 }
@@ -3411,7 +3420,7 @@ static void early_kmem_cache_node_alloc(int node)
 	page->frozen = 0;
 	kmem_cache_node->node[node] = n;
 	init_kmem_cache_node(n);
-	inc_slabs_node(kmem_cache_node, node, page->objects);
+	inc_slabs_node(kmem_cache_node, node, page->objects, page->inuse);
 
 	/*
 	 * No locks need to be taken here as it has just been
@@ -4857,8 +4866,7 @@ static ssize_t show_slab_objects(struct kmem_cache *s,
 			if (flags & SO_TOTAL)
 				x = atomic_long_read(&n->total_objects);
 			else if (flags & SO_OBJECTS)
-				x = atomic_long_read(&n->total_objects) -
-					count_partial(n, count_free);
+				x = atomic_long_read(&n->total_inuse);
 			else
 				x = atomic_long_read(&n->nr_slabs);
 			total += x;
@@ -5900,17 +5908,17 @@ void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
 {
 	unsigned long nr_slabs = 0;
 	unsigned long nr_objs = 0;
-	unsigned long nr_free = 0;
+	unsigned long nr_inuse = 0;
 	int node;
 	struct kmem_cache_node *n;
 
 	for_each_kmem_cache_node(s, node, n) {
 		nr_slabs += node_nr_slabs(n);
 		nr_objs += node_nr_objs(n);
-		nr_free += count_partial(n, count_free);
+		nr_inuse += node_nr_inuse(n);
 	}
 
-	sinfo->active_objs = nr_objs - nr_free;
+	sinfo->active_objs = nr_inuse;
 	sinfo->num_objs = nr_objs;
 	sinfo->active_slabs = nr_slabs;
 	sinfo->num_slabs = nr_slabs;
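
As a rough way to observe the effect from user space, the hypothetical
helper below (not part of this patch; names are illustrative) times one
full read of /proc/slabinfo. Run it while another CPU hammers the same
cache's allocation path: before the change the read latency can spike as
described above, afterwards it should stay flat.

/*
 * Hypothetical userspace helper: time one full read of /proc/slabinfo.
 * Build with: cc -O2 -o slabinfo_lat slabinfo_lat.c
 */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	char buf[1 << 16];
	struct timespec t0, t1;
	ssize_t n;
	int fd = open("/proc/slabinfo", O_RDONLY);	/* usually requires root */

	if (fd < 0) {
		perror("open /proc/slabinfo");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;				/* drain the whole file */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	close(fd);

	printf("slabinfo read latency: %.3f ms\n",
	       (t1.tv_sec - t0.tv_sec) * 1e3 +
	       (t1.tv_nsec - t0.tv_nsec) / 1e6);
	return 0;
}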