From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7006CECE587 for ; Tue, 1 Oct 2019 14:40:36 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3AB322054F for ; Tue, 1 Oct 2019 14:40:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3AB322054F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AAC338E0006; Tue, 1 Oct 2019 10:40:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A5B398E0001; Tue, 1 Oct 2019 10:40:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 94ABB8E0006; Tue, 1 Oct 2019 10:40:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0165.hostedemail.com [216.40.44.165]) by kanga.kvack.org (Postfix) with ESMTP id 7221E8E0001 for ; Tue, 1 Oct 2019 10:40:35 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id 2DE106131 for ; Tue, 1 Oct 2019 14:40:35 +0000 (UTC) X-FDA: 75995476830.16.knee05_797347700d80a X-HE-Tag: knee05_797347700d80a X-Filterd-Recvd-Size: 8127 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by imf18.hostedemail.com (Postfix) with ESMTP for ; Tue, 1 Oct 2019 14:40:34 +0000 (UTC) Received: from smtp.corp.redhat.com 
(int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 233012101; Tue, 1 Oct 2019 14:40:32 +0000 (UTC) Received: from t460s.redhat.com (ovpn-116-54.ams2.redhat.com [10.36.116.54]) by smtp.corp.redhat.com (Postfix) with ESMTP id B08FB5D9D5; Tue, 1 Oct 2019 14:40:12 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, David Hildenbrand , "Aneesh Kumar K . V" , Andrew Morton , Dan Williams , Michal Hocko , Alexander Duyck , Alexander Potapenko , Andy Lutomirski , Anshuman Khandual , Benjamin Herrenschmidt , Borislav Petkov , Catalin Marinas , Christian Borntraeger , Christophe Leroy , Dave Hansen , Fenghua Yu , Gerald Schaefer , Greg Kroah-Hartman , Halil Pasic , Heiko Carstens , "H. 
Peter Anvin" , Ingo Molnar , Ira Weiny , Jason Gunthorpe , Jun Yao , Logan Gunthorpe , Mark Rutland , Masahiro Yamada , "Matthew Wilcox (Oracle)" , Mel Gorman , Michael Ellerman , Mike Rapoport , Oscar Salvador , Pankaj Gupta , Paul Mackerras , Pavel Tatashin , Pavel Tatashin , Peter Zijlstra , Qian Cai , Rich Felker , Robin Murphy , Steve Capper , Thomas Gleixner , Tom Lendacky , Tony Luck , Vasily Gorbik , Vlastimil Babka , Wei Yang , Wei Yang , Will Deacon , Yoshinori Sato , Yu Zhao
Subject: [PATCH v5 00/10] mm/memory_hotplug: Shrink zones before removing memory
Date: Tue, 1 Oct 2019 16:40:01 +0200
Message-Id: <20191001144011.3801-1-david@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.71]); Tue, 01 Oct 2019 14:40:33 +0000 (UTC)
Content-Transfer-Encoding: quoted-printable
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

This series fixes the access of uninitialized memmaps when shrinking
zones/nodes and when removing memory. It also contains all fixes for
crashes that can be triggered when removing certain namespaces using
memunmap_pages() (ZONE_DEVICE), as reported by Aneesh.

We stop trying to shrink ZONE_DEVICE: the code is buggy, fixing it would
be more involved (we don't have SECTION_IS_ONLINE as an indicator), and
shrinking is only of limited use (set_zone_contiguous() cannot detect
the ZONE_DEVICE as contiguous).

We continue shrinking !ZONE_DEVICE zones; however, I reduced the amount
of code to a minimum. Shrinking is necessary to keep zone->contiguous
set where possible, especially on memory unplug of DIMMs at zone
boundaries.

--------------------------------------------------------------------------

Zones are now properly shrunk when offlining memory blocks or when
onlining fails. This allows zones to be shrunk properly on memory unplug
even if the separate memory blocks of a DIMM were onlined to different
zones or re-onlined to a different zone after offlining.

Example:

:/# cat /proc/zoneinfo
Node 1, zone  Movable
        spanned  0
        present  0
        managed  0
:/# echo "online_movable" > /sys/devices/system/memory/memory41/state
:/# echo "online_movable" > /sys/devices/system/memory/memory43/state
:/# cat /proc/zoneinfo
Node 1, zone  Movable
        spanned  98304
        present  65536
        managed  65536
:/# echo 0 > /sys/devices/system/memory/memory43/online
:/# cat /proc/zoneinfo
Node 1, zone  Movable
        spanned  32768
        present  32768
        managed  32768
:/# echo 0 > /sys/devices/system/memory/memory41/online
:/# cat /proc/zoneinfo
Node 1, zone  Movable
        spanned  0
        present  0
        managed  0

--------------------------------------------------------------------------

I tested this with DIMMs on x86. It sounded like Aneesh tested the
ZONE_DEVICE part :)

v4 -> v5:
- "mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span()"
-- Add more details on why ZONE_DEVICE is special
- Include two patches from Aneesh
-- "mm/memunmap: Use the correct start and end pfn when removing pages from zone"
-- "mm/memmap_init: Update variable name in memmap_init_zone"

v3 -> v4:
- Drop "mm/memremap: Get rid of memmap_init_zone_device()"
-- As Alexander noticed, it was messy either way
- Drop "mm/memory_hotplug: Exit early in __remove_pages() on BUGs"
- Drop "mm: Exit early in set_zone_contiguous() if already contiguous"
- Drop "mm/memory_hotplug: Optimize zone shrinking code when checking for holes"
- Merged "mm/memory_hotplug: Remove pages from a zone before removing memory" and "mm/memory_hotplug: Remove zone parameter from __remove_pages()" into "mm/memory_hotplug: Shrink zones when offlining memory"
- Added "mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()"
- Stop shrinking ZONE_DEVICE
- Reshuffle patches, moving all fixes to the front. Add Fixes: tags.
- Change subject/description of various patches
- Minor changes (too many to mention)

Cc: Aneesh Kumar K.V
Cc: Andrew Morton
Cc: Dan Williams
Cc: Michal Hocko

Aneesh Kumar K.V (2):
  mm/memunmap: Use the correct start and end pfn when removing pages
    from zone
  mm/memmap_init: Update variable name in memmap_init_zone

David Hildenbrand (8):
  mm/memory_hotplug: Don't access uninitialized memmaps in
    shrink_pgdat_span()
  mm/memory_hotplug: Don't access uninitialized memmaps in
    shrink_zone_span()
  mm/memory_hotplug: Shrink zones when offlining memory
  mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone()
  mm/memory_hotplug: We always have a zone in
    find_(smallest|biggest)_section_pfn
  mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span()
  mm/memory_hotplug: Drop local variables in shrink_zone_span()
  mm/memory_hotplug: Cleanup __remove_pages()

 arch/arm64/mm/mmu.c            |   4 +-
 arch/ia64/mm/init.c            |   4 +-
 arch/powerpc/mm/mem.c          |   3 +-
 arch/s390/mm/init.c            |   4 +-
 arch/sh/mm/init.c              |   4 +-
 arch/x86/mm/init_32.c          |   4 +-
 arch/x86/mm/init_64.c          |   4 +-
 include/linux/memory_hotplug.h |   7 +-
 mm/memory_hotplug.c            | 184 +++++++++++---------------------
 mm/memremap.c                  |  14 ++-
 mm/page_alloc.c                |   8 +-
 11 files changed, 88 insertions(+), 152 deletions(-)

-- 
2.21.0
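
A note on the /proc/zoneinfo example in this cover letter: the spanned and present counts are consistent with a simple model in which every memory block contributes 32768 pages. That figure is an assumption (128 MiB memory blocks with 4 KiB pages); it matches the numbers shown but is not stated in the series. The sketch below is a toy model in Python, not kernel code, and `PAGES_PER_BLOCK` and `zone_counters` are hypothetical names introduced only for illustration.

```python
# Toy model, not kernel code: zone counters as a function of which memory
# blocks are currently online to the zone.
PAGES_PER_BLOCK = 32768  # assumed: 128 MiB memory blocks, 4 KiB pages


def zone_counters(online_blocks):
    """Return (spanned, present) in pages for a set of online block ids."""
    if not online_blocks:
        return (0, 0)
    first, last = min(online_blocks), max(online_blocks)
    spanned = (last - first + 1) * PAGES_PER_BLOCK  # first to last, holes included
    present = len(online_blocks) * PAGES_PER_BLOCK  # only blocks actually online
    return (spanned, present)


# Replay the sequence from the example: online 41, online 43, offline 43,
# offline 41, printing the counters after each step.
zone = set()
steps = [("online", 41), ("online", 43), ("offline", 43), ("offline", 41)]
for action, block in steps:
    if action == "online":
        zone.add(block)
    else:
        zone.discard(block)
    print(action, block, zone_counters(zone))
```

With memory41 and memory43 online, the span also covers the hole left by memory42, so spanned is three blocks while present is two. Offlining memory43 shrinks the span back to a single block, which is the behavior the series restores at offline time.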