From: Vlastimil Babka
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Michal Hocko,
 Pavel Tatashin, David Hildenbrand, Oscar Salvador, Joonsoo Kim,
 Vlastimil Babka
Subject: [PATCH v3 0/7] disable pcplists during memory offline
Date: Wed, 11 Nov 2020 10:28:05 +0100
Message-Id: <20201111092812.11329-1-vbabka@suse.cz>
X-Mailer: git-send-email 2.29.1

Changes since v2 [8]:
- add acks/reviews (thanks David and Oscar)
- small wording and style changes
- rebase to next-20201111

Changes since v1 [7]:
- add acks/reviews (thanks David and Michal)
- drop "mm, page_alloc: make per_cpu_pageset accessible only after init" as
  that's orthogonal and needs more consideration
- squash "mm, page_alloc: drain all pcplists during memory offline" into the
  last patch, and move the new zone_pcp_* functions into mm/page_alloc.c. As
  such, the new 'force all cpus' param of __drain_all_pages() is never
  exported outside page_alloc.c, so I didn't add a new wrapper function to
  hide the bool
- keep pcp_batch_high_lock a mutex, as offline_pages() is synchronized
  anyway, as suggested by Michal. Thus we don't need an atomic variable and
  synchronization around it, and the patch is much smaller. If
  alloc_contig_range() wants to use the new functionality and keep
  parallelism, we can add that on top.

As per the discussions [1] [2], this is an attempt to implement David's
suggestion that page isolation should disable pcplists to avoid races with
page freeing in progress. This is done without extra checks in the fast
paths, as explained in patch 7. The repeated draining done by [2] is then
no longer needed. The previous version (RFC) is at [3].

The RFC tried to hide the pcplist disabling/enabling inside page isolation,
but that wasn't completely possible, as memory offline does not perform the
unisolation. Michal suggested an explicit API in [4], so that's the current
implementation, and it indeed seems nicer.

Once we accept that page isolation users need to take explicit actions
around it depending on the guarantees they need, we can IMHO also accept
that the current pcplist draining can be done by the callers, which is more
effective. After all, there are only two users of page isolation. So patch 6
does effectively the same thing as Pavel proposed in [5], and patch 7
implements stronger guarantees only for memory offline. If CMA decides to
opt in to the stronger guarantee, it can be added later.
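To make the intended mechanism concrete, here is a rough sketch of what the
explicit disable/enable API in mm/page_alloc.c could look like, pieced
together from the points above (the new zone_pcp_* helpers, the 'force all
cpus' mode of __drain_all_pages(), pcp_batch_high_lock staying a mutex, and
the pageset high/batch values cached in struct zone by patch 5). The helper
and field names below are illustrative and may not match the patches exactly:

/* Sketch only - see patch 7 for the actual implementation. */
void zone_pcp_disable(struct zone *zone)
{
	/* A plain mutex suffices because offline_pages() is serialized. */
	mutex_lock(&pcp_batch_high_lock);

	/*
	 * With high = 0 and batch = 1, freed pages bypass the pcplists and
	 * go straight to the buddy allocator, so no new checks are needed
	 * in the alloc/free fast paths.
	 */
	__zone_set_pageset_high_and_batch(zone, 0, 1);

	/* Flush pages already sitting on the pcplists, on all CPUs. */
	__drain_all_pages(zone, true);
}

void zone_pcp_enable(struct zone *zone)
{
	/* Restore the high/batch values cached in struct zone (patch 5). */
	__zone_set_pageset_high_and_batch(zone, zone->pageset_high,
					  zone->pageset_batch);
	mutex_unlock(&pcp_batch_high_lock);
}

Memory offline would then bracket the isolation, migration and offlining of
the range with these two calls, while alloc_contig_range() keeps only the
explicit draining moved to the callers by patch 6.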
Patches 1-5 are preparatory cleanups for the pcplist disabling.

The patchset was briefly tested in QEMU to check that memory online/offline
works, but I haven't done a stress test that would prove that the race fixed
by [2] is eliminated.

Note that patch 7 could be avoided if we instead adjusted page freeing as
shown in [6], but I believe the current implementation of disabling pcplists
is not too complex, so I would prefer it over adding new checks and a longer
irq-disabled section to the page freeing hotpaths.

[1] https://lore.kernel.org/linux-mm/20200901124615.137200-1-pasha.tatashin@soleen.com/
[2] https://lore.kernel.org/linux-mm/20200903140032.380431-1-pasha.tatashin@soleen.com/
[3] https://lore.kernel.org/linux-mm/20200907163628.26495-1-vbabka@suse.cz/
[4] https://lore.kernel.org/linux-mm/20200909113647.GG7348@dhcp22.suse.cz/
[5] https://lore.kernel.org/linux-mm/20200904151448.100489-3-pasha.tatashin@soleen.com/
[6] https://lore.kernel.org/linux-mm/3d3b53db-aeaa-ff24-260b-36427fac9b1c@suse.cz/
[7] https://lore.kernel.org/linux-mm/20200922143712.12048-1-vbabka@suse.cz/
[8] https://lore.kernel.org/linux-mm/20201008114201.18824-1-vbabka@suse.cz/

Vlastimil Babka (7):
  mm, page_alloc: clean up pageset high and batch update
  mm, page_alloc: calculate pageset high and batch once per zone
  mm, page_alloc: remove setup_pageset()
  mm, page_alloc: simplify pageset_update()
  mm, page_alloc: cache pageset high and batch in struct zone
  mm, page_alloc: move draining pcplists to page isolation users
  mm, page_alloc: disable pcplists during memory offline

 include/linux/mmzone.h |   6 ++
 mm/internal.h          |   2 +
 mm/memory_hotplug.c    |  27 +++---
 mm/page_alloc.c        | 195 ++++++++++++++++++++++++-----------------
 mm/page_isolation.c    |  10 +--
 5 files changed, 141 insertions(+), 99 deletions(-)

-- 
2.29.1