Subject: Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects
From: xunlei
Reply-To: xlpang@linux.alibaba.com
To: Pekka Enberg
Cc: Christoph Lameter, Andrew Morton, Wen Yang, Yang Shi, Roman Gushchin,
    linux-mm@kvack.org, LKML
Date: Fri, 3 Jul 2020 17:37:52 +0800
Message-ID: <7374a9fd-460b-1a51-1ab4-25170337e5f2@linux.alibaba.com>
References: <1593678728-128358-1-git-send-email-xlpang@linux.alibaba.com>

On 2020/7/2 PM 7:59, Pekka Enberg wrote:
> On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang wrote:
>> count_partial() holds the node list_lock for a long time while
>> iterating over large partial page lists, which can cause a
>> thundering-herd effect on list_lock contention; e.g. it causes
>> business response-time jitter when "/proc/slabinfo" is accessed
>> in our production environments.
>
> Would you have any numbers to share to quantify this jitter? I have no

We have HSF RT (High-speed Service Framework Response-Time) monitors,
and the RT figures fluctuated randomly. We then deployed a tool that
detects "irq off" and "preempt off" periods and dumps the culprit's
calltrace; it captured list_lock being held with irqs off for up to
100ms, triggered by "ss", which also caused network timeouts.

> objections to this approach, but I think the original design
> deliberately made reading "/proc/slabinfo" more expensive to avoid
> atomic operations in the allocation/deallocation paths. It would be
> good to understand what the gain of this approach is before we switch
> to it. Maybe even run some slab-related benchmark (not sure if there's
> something better than hackbench these days) to see if the overhead of
> this approach shows up.

I considered that before, but most of the atomic operations are already
serialized by the list_lock. Another possible way is to also hold the
list_lock in __slab_free(); then these two counters could be changed
from atomic to plain long.

I also don't know what the standard SLUB benchmark for regression
testing is; any specific suggestion?

>
>> This patch introduces two counters to maintain the actual number
>> of partial objects dynamically instead of iterating the partial
>> page lists with list_lock held.
>>
>> The new counters of kmem_cache_node are: pfree_objects and
>> ptotal_objects. The main operations are under list_lock in the slow
>> path, so the performance impact is minimal.
>>
>> Co-developed-by: Wen Yang
>> Signed-off-by: Xunlei Pang
>> ---
>>  mm/slab.h |  2 ++
>>  mm/slub.c | 38 +++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 39 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 7e94700..5935749 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> @@ -616,6 +616,8 @@ struct kmem_cache_node {
>>  #ifdef CONFIG_SLUB
>>  	unsigned long nr_partial;
>>  	struct list_head partial;
>> +	atomic_long_t pfree_objects;	/* partial free objects */
>> +	atomic_long_t ptotal_objects;	/* partial total objects */
>
> You could rename these to "nr_partial_free_objs" and
> "nr_partial_total_objs" for readability.

Sounds good. Thanks!

>
> - Pekka
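P.S. For concreteness, here is a minimal sketch of the scheme under
discussion. Only the two kmem_cache_node fields come from the quoted
mm/slab.h hunk; the struct is trimmed down, the helper names and call
sites are hypothetical, and the atomic-vs-list_lock trade-off raised
above is still open:

#include <linux/atomic.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Trimmed stand-in; only the two new fields come from the patch. */
struct kmem_cache_node {
	spinlock_t list_lock;
	unsigned long nr_partial;
	struct list_head partial;
	atomic_long_t pfree_objects;	/* partial free objects */
	atomic_long_t ptotal_objects;	/* partial total objects */
};

/*
 * Hypothetical helpers, called wherever a page enters or leaves
 * n->partial or its free-object count changes. In the slow path this
 * already happens under n->list_lock; __slab_free() is where the
 * update can race, hence the atomic type.
 */
static inline void partial_objects_add(struct kmem_cache_node *n,
				       long free, long total)
{
	atomic_long_add(free, &n->pfree_objects);
	atomic_long_add(total, &n->ptotal_objects);
}

static inline void partial_objects_sub(struct kmem_cache_node *n,
				       long free, long total)
{
	atomic_long_sub(free, &n->pfree_objects);
	atomic_long_sub(total, &n->ptotal_objects);
}

/*
 * Reading /proc/slabinfo then becomes a single atomic read instead of
 * walking the whole partial list with list_lock held and irqs off.
 */
static unsigned long count_partial_free(struct kmem_cache_node *n)
{
	return atomic_long_read(&n->pfree_objects);
}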