From: Pavel Tatashin <pasha.tatashin@oracle.com>
To: linux-kernel@vger.kernel.org, sparclinux@vger.kernel.org, linux-mm@kvack.org,
    linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, x86@kernel.org,
    kasan-dev@googlegroups.com, borntraeger@de.ibm.com,
    heiko.carstens@de.ibm.com, davem@davemloft.net, willy@infradead.org,
    mhocko@kernel.org, ard.biesheuvel@linaro.org, will.deacon@arm.com,
    catalin.marinas@arm.com, sam@ravnborg.org
Subject: [v6 00/15] complete deferred page initialization
Date: Mon, 7 Aug 2017 16:38:34 -0400
Message-Id: <1502138329-123460-1-git-send-email-pasha.tatashin@oracle.com>

Changelog:
v6 - v4
- Fixed ARM64 + kasan code, as reported by Ard Biesheuvel
- Tested the ARM64 code in qemu and found a few more issues, which I fixed
  in this iteration
- Added page roundup/rounddown to the x86 and arm zeroing routines to zero
  the whole allocated range, instead of only the provided address range
- Addressed a SPARC-related comment from Sam Ravnborg
- Fixed section mismatch warnings related to memblock_discard()

v5 - v4
- Fixed build issues reported by kbuild on various configurations

v4 - v3
- Rewrote the code to zero struct pages in __init_single_page(), as
  suggested by Michal Hocko
- Added code to handle issues related to accessing struct page memory
  before it is initialized

v3 - v2
- Addressed David Miller's comments about one change per patch:
  * Split the platform changes into 4 patches
  * Made "do not zero vmemmap_buf" a separate patch

v2 - v1
- Per request, added s390 to deferred "struct page" zeroing
- Collected performance data on x86 which proves the importance of keeping
  memset() as a prefetch (see below)

SMP machines can benefit from the DEFERRED_STRUCT_PAGE_INIT config option,
which defers initializing struct pages until all cpus have been started, so
that the initialization can be done in parallel. However, this feature is
sub-optimal, because the deferred page initialization code expects that the
struct pages have already been zeroed, and that zeroing is done early in
boot by a single thread only. Also, we access that memory and set flags
before the struct pages are initialized. All of this is fixed in this
patchset.
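To make the approach below more concrete, here is a minimal sketch of
zeroing the struct page at the start of __init_single_page(), per the v4
changelog entry above. This is an illustration, not the exact patch; the
helpers shown (set_page_links(), init_page_count(), page_mapcount_reset())
are assumed to keep their usual roles:

	/*
	 * Sketch only: zero the struct page itself before setting any of
	 * its fields, so nothing depends on the memmap having been
	 * pre-zeroed by a single thread early in boot.
	 */
	static void __meminit __init_single_page(struct page *page,
						 unsigned long pfn,
						 unsigned long zone, int nid)
	{
		memset(page, 0, sizeof(struct page));	/* zero first ...      */
		set_page_links(page, zone, nid, pfn);	/* ... then set fields */
		init_page_count(page);
		page_mapcount_reset(page);
		INIT_LIST_HEAD(&page->lru);
	}

Likewise, the "page roundup/rounddown" item in the v6 changelog amounts to
something like the hypothetical helper below (the name is made up for
illustration): when zeroing a kasan shadow or page table region, expand the
range to whole pages so the entire allocation is zeroed, not just the
requested sub-range:

	/* Hypothetical helper, for illustration only. */
	static void __init zero_whole_pages(void *start, void *end)
	{
		unsigned long s = round_down((unsigned long)start, PAGE_SIZE);
		unsigned long e = round_up((unsigned long)end, PAGE_SIZE);

		memset((void *)s, 0, e - s);
	}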
In this work we do the following:
- Never read-access a struct page until it has been initialized
- Never set any fields in struct pages before they are initialized
- Zero the struct page at the beginning of struct page initialization

Performance improvements on an x86 machine with 8 nodes,
Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz:
Single threaded struct page init: 7.6s/T improvement
Deferred struct page init:       10.2s/T improvement

Pavel Tatashin (15):
  x86/mm: reserve only existing low pages
  x86/mm: setting fields in deferred pages
  sparc64/mm: setting fields in deferred pages
  mm: discard memblock data later
  mm: don't access uninitialized struct pages
  sparc64: simplify vmemmap_populate
  mm: defining memblock_virt_alloc_try_nid_raw
  mm: zero struct pages during initialization
  sparc64: optimized struct page zeroing
  x86/kasan: explicitly zero kasan shadow memory
  arm64/kasan: explicitly zero kasan shadow memory
  mm: explicitly zero pagetable memory
  mm: stop zeroing memory during allocation in vmemmap
  mm: optimize early system hash allocations
  mm: debug for raw allocator

 arch/arm64/mm/kasan_init.c          |  42 ++++++++++
 arch/sparc/include/asm/pgtable_64.h |  30 +++++++
 arch/sparc/mm/init_64.c             |  31 +++-----
 arch/x86/kernel/setup.c             |   5 +-
 arch/x86/mm/init_64.c               |   9 ++-
 arch/x86/mm/kasan_init_64.c         |  67 ++++++++++++++++
 include/linux/bootmem.h             |  27 +++++++
 include/linux/memblock.h            |   9 ++-
 include/linux/mm.h                  |   9 +++
 mm/memblock.c                       | 152 ++++++++++++++++++++++++++++--------
 mm/nobootmem.c                      |  16 ----
 mm/page_alloc.c                     |  31 +++++---
 mm/sparse-vmemmap.c                 |  10 ++-
 mm/sparse.c                         |   6 +-
 14 files changed, 356 insertions(+), 88 deletions(-)

-- 
2.14.0