Subject: Re: [PATCH v18 20/32] mm/lru: replace pgdat lru_lock with lruvec lock
To: Hugh Dickins
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, tj@kernel.org,
 khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com, willy@infradead.org,
 hannes@cmpxchg.org, lkp@intel.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, shakeelb@google.com,
 iamjoonsoo.kim@lge.com, richard.weiyang@gmail.com, kirill@shutemov.name,
 alexander.duyck@gmail.com, rong.a.chen@intel.com, mhocko@suse.com,
 vdavydov.dev@gmail.com, shy828301@gmail.com, Michal Hocko, Yang Shi
References: <1598273705-69124-1-git-send-email-alex.shi@linux.alibaba.com>
 <1598273705-69124-21-git-send-email-alex.shi@linux.alibaba.com>
From: Alex Shi
Message-ID: <0ae8709b-b82a-0956-9934-cbbd1f2e50ce@linux.alibaba.com>
Date: Tue, 22 Sep 2020 16:58:47 +0800

On 2020/9/22 1:27 PM, Hugh Dickins wrote:
> On Mon, 24 Aug 2020, Alex
> Shi wrote:
>
>> This patch moves per node lru_lock into lruvec, thus bring a lru_lock for
>> each of memcg per node. So on a large machine, each of memcg don't
>> have to suffer from per node pgdat->lru_lock competition. They could go
>> fast with their self lru_lock.
>>
>> After move memcg charge before lru inserting, page isolation could
>> serialize page's memcg, then per memcg lruvec lock is stable and could
>> replace per node lru lock.
>>
>> In func isolate_migratepages_block, compact_unlock_should_abort is
>> opend, and lock_page_lruvec logical is embedded for tight process.
>
> Hard to understand: perhaps:
>
> In func isolate_migratepages_block, compact_unlock_should_abort and
> lock_page_lruvec_irqsave are open coded to work with compact_control.

Will update with your suggestion. Thanks!

>
>> Also add a debug func in locking which may give some clues if there are
>> sth out of hands.
>>
>> According to Daniel Jordan's suggestion, I run 208 'dd' with on 104
>> containers on a 2s * 26cores * HT box with a modefied case:
>> https://git.kernel.org/pub/scm/linux/kernel/git/wfg/vm-scalability.git/tree/case-lru-file-readtwice
>
> s/modeified/modified/
> lruv19 has an lkml.org link there, please substitute
> https://lore.kernel.org/lkml/01ed6e45-3853-dcba-61cb-b429a49a7572@linux.alibaba.com/
>

Thanks!

>>
>> With this and later patches, the readtwice performance increases
>> about 80% within concurrent containers.
>>
>> On a large machine with memcg enabled but not used, the page's lruvec
>> seeking pass a few pointers, that may lead to lru_lock holding time
>> increase and a bit regression.
>>
>> Hugh Dickins helped on patch polish, thanks!
>>
>> Reported-by: kernel test robot
>
> Eh? It may have reported some locking bugs somewhere, but this
> is the main patch of your per-memcg lru_lock: I don't think the
> kernel test robot inspired your whole design, did it? Delete that.
>
>
>> Signed-off-by: Alex Shi
>
> I can't quite Ack this one yet, because there are several functions
> (mainly __munlock_pagevec and check_move_unevictable_pages) which are
> not right in this v18 version, and a bit tricky to correct: I already
> suggested what to do in other mail, but this patch comes before
> relock_page_lruvec, so must look different from the final result;
> I need to look at a later version, perhaps already there in your
> github tree, before I can Ack: but it's not far off.
> Comments below.

All suggestions are taken! Many thanks for such a detailed review!

A new branch addressing all comments is available at
https://github.com/alexshi/linux.git lruv19.5

A quick summary of the branch:

Add a new patch for move_pages_to_lru:
	mm/vmscan: remove lruvec reget in move_pages_to_lru
Add another patch, split out from 'Introduce TestClearPageLRU':
	mm/swap.c: reorder __ClearPageLRU and lruvec
The mlock changes moved earlier:
	mm/mlock: remove __munlock_isolate_lru_page
	mm/mlock: remove __munlock_isolate_lru_page

I am wondering whether it is good to send out v19 now, or better to
wait for your confirmation that all suggestions/comments are settled?

Thanks
Alex