From: Vlastimil Babka
To: Alex Shi
Cc: Konstantin Khlebnikov, Andrew Morton, Hugh Dickins, Yu Zhao, Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH next] mm/swap.c: reduce lock contention in lru_cache_add
Date: Wed, 25 Nov 2020 16:38:18 +0100
In-Reply-To: <1605860847-47445-1-git-send-email-alex.shi@linux.alibaba.com>

On 11/20/20 9:27 AM, Alex Shi wrote:
> The current relock logic changes the lru_lock whenever a new lruvec is
> found, so if two memcgs are reading files or allocating pages at the
> same time, they can end up holding the lru_lock alternately, each
> waiting for the other because of the fairness of the ticket spinlock.
>
> This patch sorts the pages by lruvec and takes each lru_lock only once
> in the above scenario, which reduces the fairness-induced waiting on
> lock reacquisition. With it, vm-scalability/case-lru-file-readtwice
> gains ~5% performance on my 2P*20core*HT machine.

Hm, once you sort the pages like this, it's a shame not to splice them
instead of more list_del() + list_add() iterations. update_lru_size()
could also be called once?
> Suggested-by: Konstantin Khlebnikov
> Signed-off-by: Alex Shi
> Cc: Konstantin Khlebnikov
> Cc: Andrew Morton
> Cc: Hugh Dickins
> Cc: Yu Zhao
> Cc: Michal Hocko
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/swap.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 490553f3f9ef..c787b38bf9c0 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -1009,24 +1009,65 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec)
>  	trace_mm_lru_insertion(page, lru);
>  }
>
> +struct lruvecs {
> +	struct list_head lists[PAGEVEC_SIZE];
> +	struct lruvec *vecs[PAGEVEC_SIZE];
> +};
> +
> +/* Sort pvec pages on their lruvec */
> +int sort_page_lruvec(struct lruvecs *lruvecs, struct pagevec *pvec)
> +{
> +	int i, j, nr_lruvec;
> +	struct page *page;
> +	struct lruvec *lruvec = NULL;
> +
> +	lruvecs->vecs[0] = NULL;
> +	for (i = nr_lruvec = 0; i < pagevec_count(pvec); i++) {
> +		page = pvec->pages[i];
> +		lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> +
> +		/* Try to find a same lruvec */
> +		for (j = 0; j <= nr_lruvec; j++)
> +			if (lruvec == lruvecs->vecs[j])
> +				break;
> +
> +		/* A new lruvec */
> +		if (j > nr_lruvec) {
> +			INIT_LIST_HEAD(&lruvecs->lists[nr_lruvec]);
> +			lruvecs->vecs[nr_lruvec] = lruvec;
> +			j = nr_lruvec++;
> +			lruvecs->vecs[nr_lruvec] = 0;
> +		}
> +
> +		list_add_tail(&page->lru, &lruvecs->lists[j]);
> +	}
> +
> +	return nr_lruvec;
> +}
> +
>  /*
>   * Add the passed pages to the LRU, then drop the caller's refcount
>   * on them. Reinitialises the caller's pagevec.
>   */
>  void __pagevec_lru_add(struct pagevec *pvec)
>  {
> -	int i;
> -	struct lruvec *lruvec = NULL;
> +	int i, nr_lruvec;
>  	unsigned long flags = 0;
> +	struct page *page;
> +	struct lruvecs lruvecs;
>
> -	for (i = 0; i < pagevec_count(pvec); i++) {
> -		struct page *page = pvec->pages[i];
> +	nr_lruvec = sort_page_lruvec(&lruvecs, pvec);
>
> -		lruvec = relock_page_lruvec_irqsave(page, lruvec, &flags);
> -		__pagevec_lru_add_fn(page, lruvec);
> +	for (i = 0; i < nr_lruvec; i++) {
> +		spin_lock_irqsave(&lruvecs.vecs[i]->lru_lock, flags);
> +		while (!list_empty(&lruvecs.lists[i])) {
> +			page = lru_to_page(&lruvecs.lists[i]);
> +			list_del(&page->lru);
> +			__pagevec_lru_add_fn(page, lruvecs.vecs[i]);
> +		}
> +		spin_unlock_irqrestore(&lruvecs.vecs[i]->lru_lock, flags);
>  	}
> -	if (lruvec)
> -		unlock_page_lruvec_irqrestore(lruvec, flags);
> +
>  	release_pages(pvec->pages, pvec->nr);
>  	pagevec_reinit(pvec);
>  }