Message-ID: <0dacdaac-a530-3499-b2ed-ee210c41ea1e@intel.com>
Date: Mon, 5 Sep 2022 20:02:34 +0800
Subject: Re: [PATCH] ipc/msg.c: mitigate the lock contention with percpu counter
To: Shakeel Butt
Cc: Andrew Morton, vasily.averin@linux.dev, Dennis Zhou, Tejun Heo,
 Christoph Lameter, "Eric W. Biederman", Alexey Gladkov, Manfred Spraul,
 alexander.mikhalitsyn@virtuozzo.com, Linux MM, LKML, "Chen, Tim C",
 Feng Tang, Huang Ying, tianyou.li@intel.com, wangyang.guo@intel.com,
 jiebin.sun@intel.com
References: <20220902152243.479592-1-jiebin.sun@intel.com>
From: "Sun, Jiebin"

On 9/3/2022 12:27 AM, Shakeel Butt wrote:
> On Fri, Sep 2, 2022 at 12:04 AM Jiebin Sun wrote:
>> The msg_bytes and msg_hdrs atomic counters are frequently
>> updated when IPC msg queue is in heavy use, causing heavy
>> cache bounce and overhead. Change them to percpu_counters
>> greatly improve the performance. Since there is one unique
>> ipc namespace, additional memory cost is minimal. Reading
>> of the count done in msgctl call, which is infrequent. So
>> the need to sum up the counts in each CPU is infrequent.
>>
>> Apply the patch and test the pts/stress-ng-1.4.0
>> -- system v message passing (160 threads).
>>
>> Score gain: 3.38x
>>
>> CPU: ICX 8380 x 2 sockets
>> Core number: 40 x 2 physical cores
>> Benchmark: pts/stress-ng-1.4.0
>> -- system v message passing (160 threads)
>>
>> Signed-off-by: Jiebin Sun
> [...]
>> +void percpu_counter_add_local(struct percpu_counter *fbc, s64 amount)
>> +{
>> +	this_cpu_add(*fbc->counters, amount);
>> +}
>> +EXPORT_SYMBOL(percpu_counter_add_local);
> Why not percpu_counter_add()? This may drift the fbc->count more than
> batch*nr_cpus. I am assuming that is not the issue for you as you
> always do an expensive sum in the slow path. As Andrew asked, this
> should be a separate patch.

Yes. msgctl_info always does a full sum, so there is no need for
percpu_counter_add to fold the per-CPU count into the global count
each time it reaches the batch size. That is why we add
percpu_counter_add_local for this case. The sum in the slow path is
infrequent, so its additional cost is much less than that of the
atomic update in every do_msgsnd and do_msgrcv call.

I have separated the original patch into two patches.

Thanks.
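
To make the trade-off above concrete, here is a minimal userspace sketch
of the two update styles. It is only a model under stated assumptions,
not the kernel implementation: struct model_counter, MODEL_NR_CPUS,
MODEL_BATCH and the model_* functions are hypothetical names, the "CPU"
is passed in as a plain index, and there is no locking or real per-CPU
storage. It shows that a local-only add leaves the shared count
untouched, so only a full sum gives the true total.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for this model */
#define MODEL_BATCH   32  /* hypothetical batch threshold */

/* Userspace stand-in for a percpu counter; not the kernel struct. */
struct model_counter {
	int64_t count;                  /* shared, approximate total */
	int64_t percpu[MODEL_NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/*
 * Batched add: fold the per-CPU delta into the shared count once its
 * magnitude reaches the batch size. In a real kernel this fold is the
 * step that touches shared state.
 */
static void model_add_batched(struct model_counter *c, int cpu, int64_t amount)
{
	int64_t v = c->percpu[cpu] + amount;

	if (v >= MODEL_BATCH || v <= -MODEL_BATCH) {
		c->count += v;
		c->percpu[cpu] = 0;
	} else {
		c->percpu[cpu] = v;
	}
}

/* Local add: only the per-CPU slot changes; the shared count may drift. */
static void model_add_local(struct model_counter *c, int cpu, int64_t amount)
{
	c->percpu[cpu] += amount;
}

/* Precise read: shared count plus every per-CPU delta (the slow path). */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t total = c->count;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		total += c->percpu[cpu];
	return total;
}

/* Small demo: 100 local adds never touch the shared count. */
static int64_t model_demo(void)
{
	struct model_counter c = {0};

	for (int i = 0; i < 100; i++)
		model_add_local(&c, i % MODEL_NR_CPUS, 1);
	assert(c.count == 0);   /* shared count has drifted by the full total */
	return model_sum(&c);   /* the full sum still recovers it */
}
```

In this model the local variant is only safe when every reader uses
model_sum(), which mirrors the argument that msgctl_info always sums.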