From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-1.3 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED,
	DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY,
	SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,URIBL_DBL_ABUSE_MALW autolearn=no
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B2BA8C35247
	for ; Tue, 4 Feb 2020 01:36:46 +0000 (UTC)
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17])
	by mail.kernel.org (Postfix) with ESMTP id 5B6A920732
	for ; Tue, 4 Feb 2020 01:36:46 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="NcW6pnvr"
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5B6A920732
Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org
Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix)
	id 081F86B02B1; Mon, 3 Feb 2020 20:36:46 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40)
	id 059126B02B3; Mon, 3 Feb 2020 20:36:45 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042)
	id EB1366B02B4; Mon, 3 Feb 2020 20:36:45 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from forelay.hostedemail.com (smtprelay0095.hostedemail.com [216.40.44.95])
	by kanga.kvack.org (Postfix) with ESMTP id CCFB46B02B1
	for ; Mon, 3 Feb 2020 20:36:45 -0500 (EST)
Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251])
	by forelay04.hostedemail.com (Postfix) with ESMTP id 7AC472493
	for ; Tue, 4 Feb 2020 01:36:45 +0000 (UTC)
X-FDA: 76450730370.07.dog62_1dd36dde03d4f
X-HE-Tag: dog62_1dd36dde03d4f
X-Filterd-Recvd-Size: 11114
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by imf33.hostedemail.com (Postfix) with ESMTP
	for ; Tue, 4 Feb 2020 01:36:44 +0000 (UTC)
Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPSA id 236982166E;
	Tue, 4 Feb 2020 01:36:43 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=default; t=1580780204;
	bh=/JupTJVCbp+HACyUi65b+hxB+4Z/Lr85XcVGHkmbiR8=;
	h=Date:From:To:Subject:In-Reply-To:From;
	b=NcW6pnvrxfdY/sndlNtFh0Zd2M4jhe/NXOMNc/wdxXxwd1a9Ytd4j69zRlBP/cgvk
	 MNE2wFrov/sRfRVDviL/e9xV7KyXjSf4fl79MdfJUqav6B8/lC74ibKeR84w9YZ+5v
	 g9ogI3fCIwg2ioNqW55SdUk84++xzk4tNDI3W6y8=
Date: Mon, 03 Feb 2020 17:36:42 -0800
From: Andrew Morton
To: akpm@linux-foundation.org, alex@ghiti.fr, aou@eecs.berkeley.edu,
	ard.biesheuvel@linaro.org, arnd@arndb.de, benh@kernel.crashing.org,
	borntraeger@de.ibm.com, bp@alien8.de, catalin.marinas@arm.com,
	dave.hansen@linux.intel.com, davem@davemloft.net, gor@linux.ibm.com,
	heiko.carstens@de.ibm.com, hpa@zytor.com, james.morse@arm.com,
	jglisse@redhat.com, jhogan@kernel.org, kan.liang@linux.intel.com,
	linux-mm@kvack.org, linux@armlinux.org.uk, luto@kernel.org,
	mark.rutland@arm.com, mingo@redhat.com, mm-commits@vger.kernel.org,
	mpe@ellerman.id.au, paul.burton@mips.com, paul.walmsley@sifive.com,
	paulus@samba.org, peterz@infradead.org, ralf@linux-mips.org,
	sfr@canb.auug.org.au, steven.price@arm.com, tglx@linutronix.de,
	torvalds@linux-foundation.org, vgupta@synopsys.com, will@kernel.org,
	zong.li@sifive.com
Subject: [patch 47/67] x86: mm: avoid allocating struct mm_struct on the stack
Message-ID: <20200204013642.91mpRPzlx%akpm@linux-foundation.org>
In-Reply-To: <20200203173311.6269a8be06a05e5a4aa08a93@linux-foundation.org>
User-Agent: s-nail v14.8.16
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender:
	owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: Steven Price
Subject: x86: mm: avoid allocating struct mm_struct on the stack

struct mm_struct is quite large (~1664 bytes) and so allocating on the
stack may cause problems as the kernel stack size is small.

Since ptdump_walk_pgd_level_core() was only allocating the structure so
that it could modify the pgd argument we can instead introduce a pgd
override in struct mm_walk and pass this down the call stack to where it
is needed.

Since the correct mm_struct is now being passed down, it is now also
unnecessary to take the mmap_sem semaphore because ptdump_walk_pgd() will
now take the semaphore on the real mm.

[steven.price@arm.com: restore missed arm64 changes]
  Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com
Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com
Signed-off-by: Steven Price
Reported-by: Stephen Rothwell
Cc: Catalin Marinas
Cc: Albert Ou
Cc: Alexandre Ghiti
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Arnd Bergmann
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian Borntraeger
Cc: Dave Hansen
Cc: David S. Miller
Cc: Heiko Carstens
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: James Hogan
Cc: James Morse
Cc: Jerome Glisse
Cc: "Liang, Kan"
Cc: Mark Rutland
Cc: Michael Ellerman
Cc: Paul Burton
Cc: Paul Mackerras
Cc: Paul Walmsley
Cc: Peter Zijlstra
Cc: Ralf Baechle
Cc: Russell King
Cc: Thomas Gleixner
Cc: Vasily Gorbik
Cc: Vineet Gupta
Cc: Will Deacon
Cc: Zong Li
Signed-off-by: Andrew Morton
---

 arch/arm64/mm/dump.c           |    4 ++--
 arch/x86/mm/debug_pagetables.c |   10 ++--------
 arch/x86/mm/dump_pagetables.c  |   18 +++++++-----------
 include/linux/pagewalk.h       |    3 +++
 include/linux/ptdump.h         |    2 +-
 mm/pagewalk.c                  |    7 ++++++-
 mm/ptdump.c                    |    4 ++--
 7 files changed, 23 insertions(+), 25 deletions(-)

--- a/arch/arm64/mm/dump.c~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/arch/arm64/mm/dump.c
@@ -323,7 +323,7 @@ void ptdump_walk(struct seq_file *s, str
 		}
 	};
 
-	ptdump_walk_pgd(&st.ptdump, info->mm);
+	ptdump_walk_pgd(&st.ptdump, info->mm, NULL);
 }
 
 static void ptdump_initialize(void)
@@ -361,7 +361,7 @@ void ptdump_check_wx(void)
 		}
 	};
 
-	ptdump_walk_pgd(&st.ptdump, &init_mm);
+	ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);
 
 	if (st.wx_pages || st.uxn_pages)
 		pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found, %lu non-UXN pages found\n",
--- a/arch/x86/mm/debug_pagetables.c~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/arch/x86/mm/debug_pagetables.c
@@ -15,11 +15,8 @@ DEFINE_SHOW_ATTRIBUTE(ptdump);
 
 static int ptdump_curknl_show(struct seq_file *m, void *v)
 {
-	if (current->mm->pgd) {
-		down_read(&current->mm->mmap_sem);
+	if (current->mm->pgd)
 		ptdump_walk_pgd_level_debugfs(m, current->mm, false);
-		up_read(&current->mm->mmap_sem);
-	}
 	return 0;
 }
 
@@ -28,11 +25,8 @@ DEFINE_SHOW_ATTRIBUTE(ptdump_curknl);
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 static int ptdump_curusr_show(struct seq_file *m, void *v)
 {
-	if (current->mm->pgd) {
-		down_read(&current->mm->mmap_sem);
+	if (current->mm->pgd)
 		ptdump_walk_pgd_level_debugfs(m, current->mm, true);
-		up_read(&current->mm->mmap_sem);
-	}
 	return 0;
 }
 
--- a/arch/x86/mm/dump_pagetables.c~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/arch/x86/mm/dump_pagetables.c
@@ -357,7 +357,8 @@ static void note_page(struct ptdump_stat
 	}
 }
 
-static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
+static void ptdump_walk_pgd_level_core(struct seq_file *m,
+				       struct mm_struct *mm, pgd_t *pgd,
 				       bool checkwx, bool dmesg)
 {
 	const struct ptdump_range ptdump_ranges[] = {
@@ -386,12 +387,7 @@ static void ptdump_walk_pgd_level_core(s
 		.seq		= m
 	};
 
-	struct mm_struct fake_mm = {
-		.pgd = pgd
-	};
-	init_rwsem(&fake_mm.mmap_sem);
-
-	ptdump_walk_pgd(&st.ptdump, &fake_mm);
+	ptdump_walk_pgd(&st.ptdump, mm, pgd);
 
 	if (!checkwx)
 		return;
@@ -404,7 +400,7 @@ static void ptdump_walk_pgd_level_core(s
 
 void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm)
 {
-	ptdump_walk_pgd_level_core(m, mm->pgd, false, true);
+	ptdump_walk_pgd_level_core(m, mm, mm->pgd, false, true);
 }
 
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, struct mm_struct *mm,
@@ -415,7 +411,7 @@ void ptdump_walk_pgd_level_debugfs(struc
 	if (user && boot_cpu_has(X86_FEATURE_PTI))
 		pgd = kernel_to_user_pgdp(pgd);
 #endif
-	ptdump_walk_pgd_level_core(m, pgd, false, false);
+	ptdump_walk_pgd_level_core(m, mm, pgd, false, false);
 }
 EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);
 
@@ -430,13 +426,13 @@ void ptdump_walk_user_pgd_level_checkwx(
 
 	pr_info("x86/mm: Checking user space page tables\n");
 	pgd = kernel_to_user_pgdp(pgd);
-	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
+	ptdump_walk_pgd_level_core(NULL, &init_mm, pgd, true, false);
 #endif
 }
 
 void ptdump_walk_pgd_level_checkwx(void)
 {
-	ptdump_walk_pgd_level_core(NULL, INIT_PGD, true, false);
+	ptdump_walk_pgd_level_core(NULL, &init_mm, INIT_PGD, true, false);
 }
 
 static int __init pt_dump_init(void)
--- a/include/linux/pagewalk.h~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/include/linux/pagewalk.h
@@ -74,6 +74,7 @@ enum page_walk_action {
  * mm_walk - walk_page_range data
  * @ops:	operation to call during the walk
  * @mm:		mm_struct representing the target process of page table walk
+ * @pgd:	pointer to PGD; only valid with no_vma (otherwise set to NULL)
  * @vma:	vma currently walked (NULL if walking outside vmas)
  * @action:	next action to perform (see enum page_walk_action)
  * @no_vma:	walk ignoring vmas (vma will always be NULL)
@@ -84,6 +85,7 @@ enum page_walk_action {
 struct mm_walk {
 	const struct mm_walk_ops *ops;
 	struct mm_struct *mm;
+	pgd_t *pgd;
 	struct vm_area_struct *vma;
 	enum page_walk_action action;
 	bool no_vma;
@@ -95,6 +97,7 @@ int walk_page_range(struct mm_struct *mm
 		void *private);
 int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
 			  unsigned long end, const struct mm_walk_ops *ops,
+			  pgd_t *pgd,
 			  void *private);
 int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
 		void *private);
--- a/include/linux/ptdump.h~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/include/linux/ptdump.h
@@ -17,6 +17,6 @@ struct ptdump_state {
 	const struct ptdump_range *range;
 };
 
-void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm);
+void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm, pgd_t *pgd);
 
 #endif /* _LINUX_PTDUMP_H */
--- a/mm/pagewalk.c~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/mm/pagewalk.c
@@ -206,7 +206,10 @@ static int walk_pgd_range(unsigned long
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
 
-	pgd = pgd_offset(walk->mm, addr);
+	if (walk->pgd)
+		pgd = walk->pgd + pgd_index(addr);
+	else
+		pgd = pgd_offset(walk->mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd)) {
@@ -436,11 +439,13 @@ int walk_page_range(struct mm_struct *mm
  */
 int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
 			  unsigned long end, const struct mm_walk_ops *ops,
+			  pgd_t *pgd,
 			  void *private)
 {
 	struct mm_walk walk = {
 		.ops		= ops,
 		.mm		= mm,
+		.pgd		= pgd,
 		.private	= private,
 		.no_vma		= true
 	};
--- a/mm/ptdump.c~x86-mm-avoid-allocating-struct-mm_struct-on-the-stack
+++ a/mm/ptdump.c
@@ -122,14 +122,14 @@ static const struct mm_walk_ops ptdump_o
 	.pte_hole	= ptdump_hole,
 };
 
-void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm)
+void ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm, pgd_t *pgd)
 {
 	const struct ptdump_range *range = st->range;
 
 	down_read(&mm->mmap_sem);
 	while (range->start != range->end) {
 		walk_page_range_novma(mm, range->start, range->end,
-				      &ptdump_ops, pgd, st);
+				      &ptdump_ops, st);
+				      &ptdump_ops, pgd, st);
 		range++;
 	}
 	up_read(&mm->mmap_sem);
_
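
Illustration only, not part of the patch: the pgd override in
walk_pgd_range() can be modelled in userspace roughly as below.  This is a
minimal sketch under stated assumptions.  Every mock_* and MOCK_* name is
hypothetical and stands in for a kernel type, macro or helper; the shift and
table size are arbitrary example values, not those of any architecture.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's pgd_t; not the real definition. */
typedef struct { unsigned long val; } pgd_t;

/* Arbitrary example geometry, assumed for this sketch only. */
#define MOCK_PGDIR_SHIFT	30
#define MOCK_PTRS_PER_PGD	512

/* Hypothetical reduced mm: only the top-level table pointer. */
struct mock_mm {
	pgd_t *pgd;
};

/* Hypothetical reduced walk state mirroring the new mm_walk field. */
struct mock_walk {
	struct mock_mm *mm;
	pgd_t *pgd;	/* optional override; NULL means "use mm->pgd" */
};

/* Index of the top-level entry covering addr (models pgd_index()). */
static size_t mock_pgd_index(unsigned long addr)
{
	return (addr >> MOCK_PGDIR_SHIFT) & (MOCK_PTRS_PER_PGD - 1);
}

/*
 * Pick the first top-level entry for the walk: take it from the override
 * table when one is supplied, otherwise from the mm's own table.  This
 * mirrors the if/else added to walk_pgd_range() in the patch.
 */
static pgd_t *mock_first_pgd(const struct mock_walk *walk, unsigned long addr)
{
	if (walk->pgd)
		return walk->pgd + mock_pgd_index(addr);
	return walk->mm->pgd + mock_pgd_index(addr);
}
```

With a walk whose pgd is NULL the entry comes from mm->pgd, as before the
patch; with a non-NULL pgd the same mm is kept (so its real mmap_sem can be
taken) while the entries come from the override table.  That is what lets
the x86 code drop the on-stack fake_mm.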