From: Anshuman Khandual
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, akpm@linux-foundation.org,
    catalin.marinas@arm.com, will@kernel.org
Subject: [PATCH V7 0/3] arm64/mm: Enable memory hot remove
Date: Tue, 3 Sep 2019 15:15:55 +0530
Message-Id: <1567503958-25831-1-git-send-email-anshuman.khandual@arm.com>
Cc: mark.rutland@arm.com, mhocko@suse.com, david@redhat.com,
    ira.weiny@intel.com, steve.capper@arm.com, mgorman@techsingularity.net,
    steven.price@arm.com, broonie@kernel.org, cai@lca.pw,
    ard.biesheuvel@arm.com, cpandya@codeaurora.org, arunks@codeaurora.org,
    dan.j.williams@intel.com, Robin.Murphy@arm.com,
    logang@deltatee.com, valentin.schneider@arm.com, suzuki.poulose@arm.com,
    osalvador@suse.de

This series enables memory hot remove on arm64 after fixing a memblock
removal ordering problem in generic try_remove_memory() and a possible
arm64 platform specific kernel page table race condition. This series is
based on the following current arm64 tree.

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git (for-next/core)

Concurrent vmalloc() and hot-remove conflict:

As pointed out earlier on the v5 thread [2], there can be a potential
conflict between concurrent vmalloc() and memory hot-remove operations.
The problem is caused by inadequate locking in vmalloc(), which protects
installation of a page table page but not the walk or the leaf entry
modification.

Let's not free page table pages (if any) for vmemmap mappings after
unmapping the vmemmap virtual range if the vmalloc and vmemmap ranges
overlap. The only downside is that some page table pages might remain
empty and unused until the next memory hot-add operation for the same
memory range. But as mentioned earlier, those will be minimal.

Only the following two configurations have overlapping vmalloc and vmemmap
ranges, so free_empty_tables() is not called for vmemmap on them:

1. 4K page size with 48 bit VA space
2. 16K page size with 48 bit VA space

Testing:

Memory hot remove has been tested on arm64 for 4K, 16K, 64K page config
options with all possible CONFIG_ARM64_VA_BITS and CONFIG_PGTABLE_LEVELS
combinations. It is only build tested on non-arm64 platforms.
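
For readers who want the overlap rule spelled out, here is a minimal
userspace sketch of the decision described above. It is not code from the
patches: struct va_range, ranges_overlap(), remove_vmemmap_sketch() and the
two *_stub() helpers are hypothetical names made up for illustration, and
the stubs only count calls instead of touching any page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open virtual address range [start, end). */
struct va_range {
	uint64_t start;
	uint64_t end;
};

/* Two half-open ranges overlap when each starts before the other ends. */
bool ranges_overlap(struct va_range a, struct va_range b)
{
	assert(a.start <= a.end && b.start <= b.end);
	return a.start < b.end && b.start < a.end;
}

/* Call counters so the effect of the policy can be observed. */
int unmap_calls;
int free_calls;

/* Stand-in for the first pass: clear leaf entries (here: just count). */
static void unmap_hotplug_range_stub(struct va_range r)
{
	(void)r;
	unmap_calls++;
}

/* Stand-in for the second pass: free empty table pages (here: just count). */
static void free_empty_tables_stub(struct va_range r)
{
	(void)r;
	free_calls++;
}

/*
 * Assumed policy, following the cover letter: always unmap the range, but
 * skip freeing page table pages when the vmalloc and vmemmap ranges
 * overlap, because a concurrent vmalloc() walk could still reference them.
 */
void remove_vmemmap_sketch(struct va_range r, bool vmalloc_vmemmap_overlap)
{
	unmap_hotplug_range_stub(r);
	if (!vmalloc_vmemmap_overlap)
		free_empty_tables_stub(r);
}
```

The overlap flag would be computed once from the two range boundaries and
then passed to every removal. The cost is the one already stated above:
with overlap, empty table pages stay allocated until the same range is
hot-added again.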

Changes in V7:

- vmalloc_vmemmap_overlap gets evaluated early during boot for a given config
- free_empty_tables() gets conditionally called based on
  vmalloc_vmemmap_overlap

Changes in V6: (https://lkml.org/lkml/2019/7/15/36)

- Implemented most of the suggestions from Mark Rutland
- Added in ptdump
- remove_pagetable() now has two distinct passes over the kernel page table
  - First pass unmap_hotplug_range() removes leaf level entries at all levels
  - Second pass free_empty_tables() removes empty page table pages
- Kernel page table lock has been dropped completely
- vmemmap_free() does not call free_empty_tables() to avoid conflict with
  vmalloc()
- All address range scanning is converted to do {} while() loops
- Added 'unsigned long end' in __remove_pgd_mapping()
- Callers need not provide starting pointer argument to
  free_[pte|pmd|pud]_table()
- Dropped the starting pointer argument from free_[pte|pmd|pud]_table()
  functions
- Fetching pxxp[i] in free_[pte|pmd|pud]_table() is wrapped in READ_ONCE()
- free_[pte|pmd|pud]_table() now computes starting pointer inside the function
- Fixed TLB handling while freeing huge page section mappings at PMD or PUD
  level
- Added WARN_ON(!page) in free_hotplug_page_range()
- Added WARN_ON(![pm|pud]_table(pud|pmd)) when there is no section mapping
- [PATCH 1/3] mm/hotplug: Reorder memblock_[free|remove]() calls in
  try_remove_memory()
  - Requested earlier for a separate merge
    (https://patchwork.kernel.org/patch/10986599/)
  - s/__remove_memory/try_remove_memory in the subject line
  - s/arch_remove_memory/memblock_[free|remove] in the subject line
  - A small change in the commit message, as the re-order now happens for
    the memblock remove functions and not for arch_remove_memory()

Changes in V5: (https://lkml.org/lkml/2019/5/29/218)

- Have some agreement [1] over using memory_hotplug_lock for arm64 ptdump
- Change 7ba36eccb3f8 ("arm64/mm: Inhibit huge-vmap with ptdump") already
  merged
- Dropped the above patch from this series
- Fixed indentation problem in arch_[add|remove]_memory() as per David
- Collected all new Acked-by tags

Changes in V4: (https://lkml.org/lkml/2019/5/20/19)

- Implemented most of the suggestions from Mark Rutland
- Interchanged patch [PATCH 2/4] <---> [PATCH 3/4] and updated commit message
- Moved CONFIG_PGTABLE_LEVELS inside free_[pud|pmd]_table()
- Used READ_ONCE() in missing instances while accessing page table entries
- s/p???_present()/p???_none() for checking valid kernel page table entries
- WARN_ON() when an entry is !p???_none() and !p???_present() at the same time
- Updated memory hot-remove commit message with additional details as
  suggested
- Rebased the series on 5.2-rc1 with hotplug changes from David and
  Michal Hocko
- Collected all new Acked-by tags

Changes in V3: (https://lkml.org/lkml/2019/5/14/197)

- Implemented most of the suggestions from Mark Rutland for remove_pagetable()
- Fixed applicable PGTABLE_LEVEL wrappers around pgtable page freeing
  functions
- Replaced 'direct' with 'sparse_vmap' in remove_pagetable() with inverted
  polarity
- Changed pointer names ('p' at end) and removed tmp from iterations
- Perform intermediate TLB invalidation while clearing pgtable entries
- Dropped flush_tlb_kernel_range() in remove_pagetable()
- Added flush_tlb_kernel_range() in remove_pte_table() instead
- Renamed page freeing functions for pgtable page and mapped pages
- Used page range size instead of order while freeing mapped or pgtable pages
- Removed all PageReserved() handling while freeing mapped or pgtable pages
- Replaced XXX_index() with XXX_offset() while walking the kernel page table
- Used READ_ONCE() while fetching individual pgtable entries
- Took init_mm.page_table_lock for the whole operation instead of just while
  changing an entry
- Dropped previously added [pmd|pud]_index() which are not required anymore
- Added a new patch to protect kernel page table race condition for ptdump
- Added a new patch from Mark Rutland to prevent huge-vmap with ptdump

Changes in V2: (https://lkml.org/lkml/2019/4/14/5)

- Added all received review and ack tags
- Split the series from ZONE_DEVICE enablement for better review
- Moved memblock re-order patch to the front as per Robin Murphy
- Updated commit message on memblock re-order patch per Michal Hocko
- Dropped [pmd|pud]_large() definitions
- Used existing [pmd|pud]_sect() instead of earlier [pmd|pud]_large()
- Removed __meminit and __ref tags as per Oscar Salvador
- Dropped unnecessary 'ret' init in arch_add_memory() per Robin Murphy
- Skipped calling into pgtable_page_dtor() for linear mapping page table
  pages and updated all relevant functions

Changes in V1: (https://lkml.org/lkml/2019/4/3/28)

References:

[1] https://lkml.org/lkml/2019/5/28/584
[2] https://lkml.org/lkml/2019/6/11/709

Anshuman Khandual (3):
  mm/hotplug: Reorder memblock_[free|remove]() calls in try_remove_memory()
  arm64/mm: Hold memory hotplug lock while walking for kernel page table dump
  arm64/mm: Enable memory hot remove

 arch/arm64/Kconfig              |   3 +
 arch/arm64/include/asm/memory.h |   1 +
 arch/arm64/mm/mmu.c             | 338 +++++++++++++++++++++++++++++++-
 arch/arm64/mm/ptdump_debugfs.c  |   4 +
 mm/memory_hotplug.c             |   4 +-
 5 files changed, 339 insertions(+), 11 deletions(-)

-- 
2.20.1