From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82E74C2D0A3 for ; Mon, 26 Oct 2020 08:40:15 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EB7A3223B0 for ; Mon, 26 Oct 2020 08:40:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="YQYOlFrh"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="i7+DYdgO" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EB7A3223B0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=94UHM5ExHoGTbcvTiJx8L9cSewFHxdUhMd25W2SsdaQ=; b=YQYOlFrhWnHgPAfW4wgPnGtpT Cvq8mdkv/9LnMGv/DkgsD9y9NNKZiQdeoBOwfBWfkgXPzpjFHs4/fCZDGFnFqQltZGWNwqv88fgv5 +9OQ3rZ/QBIMs3tAsr/C8t+IE053zcjJkTsSJ7EwAZCriSR7+yLBH980lLtRgeehkLJozc0KhOaVw +4uHyCThAXq4OOeY+c/wUH44byz0/Yuc01uH6FQSB+t3/IaUAjicYtWcnoH4Au+xTD4qIT8RhAqE6 9RJu/xkEiv4vzXi5eQsL3bXYrQCm3wg4urCIe8v1hkK6ai5oGdAb8CAaoHlJ9tqC58O+wKiN+4OKg mdh0MesoA==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kWy2o-0001fj-El; Mon, 26 Oct 2020 08:39:58 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kWy23-0001Ia-5v; Mon, 26 Oct 2020 08:39:15 +0000 Received: from aquarius.haifa.ibm.com (nesher1.haifa.il.ibm.com [195.110.40.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B64FF223FD; Mon, 26 Oct 2020 08:39:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1603701549; bh=T40iRaLwQz12+uFzwB/HBDTs/ag1fyJ+pXo/CUd+ftI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=i7+DYdgOOAED08WdB438GSwMzbRCSH2loQlCtjivn8vq8W/9nnEK6bhy7qz1XvW3u NKqyOS+maJC6srqMf/a7rQA9IxD49vN2IB2I3M4qT/LKcoZ/uQm1FOja3q0bMtX4m9 A6rZ5frExyjgLh6+IZRHIEIcyvqBOG8xjZLL/nuY= From: Mike Rapoport To: Andrew Morton Subject: [PATCH v7 6/7] mm: secretmem: use PMD-size pages to amortize direct map fragmentation Date: Mon, 26 Oct 2020 10:37:51 +0200 Message-Id: <20201026083752.13267-7-rppt@kernel.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201026083752.13267-1-rppt@kernel.org> References: <20201026083752.13267-1-rppt@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20201026_043911_568404_3BB65633 X-CRM114-Status: GOOD ( 26.18 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , David Hildenbrand , Peter Zijlstra , Catalin Marinas , Dave Hansen , linux-mm@kvack.org, linux-kselftest@vger.kernel.org, "H. Peter Anvin" , Christopher Lameter , Shuah Khan , Thomas Gleixner , Elena Reshetova , linux-arch@vger.kernel.org, Tycho Andersen , linux-nvdimm@lists.01.org, Will Deacon , x86@kernel.org, Matthew Wilcox , Mike Rapoport , Ingo Molnar , Michael Kerrisk , Arnd Bergmann , James Bottomley , Borislav Petkov , Alexander Viro , Andy Lutomirski , Paul Walmsley , "Kirill A. Shutemov" , Dan Williams , linux-arm-kernel@lists.infradead.org, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Palmer Dabbelt , linux-fsdevel@vger.kernel.org, Rick Edgecombe , Mike Rapoport Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Mike Rapoport Removing a PAGE_SIZE page from the direct map every time such page is allocated for a secret memory mapping will cause severe fragmentation of the direct map. This fragmentation can be reduced by using PMD-size pages as a pool for small pages for secret memory mappings. Add a gen_pool per secretmem inode and lazily populate this pool with PMD-size pages. Signed-off-by: Mike Rapoport --- mm/secretmem.c | 124 ++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 107 insertions(+), 17 deletions(-) diff --git a/mm/secretmem.c b/mm/secretmem.c index 2a63db2ed132..4f9e07d212be 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -12,8 +12,10 @@ #include #include #include +#include #include #include +#include #include #include @@ -40,24 +42,80 @@ #define SECRETMEM_FLAGS_MASK SECRETMEM_MODE_MASK struct secretmem_ctx { + struct gen_pool *pool; unsigned int mode; }; -static struct page *secretmem_alloc_page(gfp_t gfp) +static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp) { + unsigned long nr_pages = (1 << PMD_PAGE_ORDER); + struct gen_pool *pool = ctx->pool; + unsigned long addr; + struct page *page; + int err; + + page = alloc_pages(gfp | __GFP_ACCOUNT, PMD_PAGE_ORDER); + if (!page) + return -ENOMEM; + + addr = (unsigned long)page_address(page); + + err = set_direct_map_invalid_noflush(page, nr_pages); + if (err) + goto err_free_pages; + + err = gen_pool_add(pool, addr, PMD_SIZE, NUMA_NO_NODE); + if (err) + goto err_set_direct_map; + + split_page(page, PMD_PAGE_ORDER); + flush_tlb_kernel_range(addr, addr + PMD_SIZE); + + return 0; + +err_set_direct_map: /* - * FIXME: use a cache of large pages to reduce the direct map - * fragmentation + * If a split of PUD-size page was required, it already happened + * when we made the pages invalid which guarantees that this call + * won't fail */ - return alloc_page(gfp); + set_direct_map_default_noflush(page, nr_pages); + +err_free_pages: + __free_pages(page, PMD_PAGE_ORDER); + return err; +} + +static struct page *secretmem_alloc_page(struct secretmem_ctx *ctx, + gfp_t gfp) +{ + struct gen_pool *pool = ctx->pool; + unsigned long addr; + struct page *page; + int err; + + if (gen_pool_avail(pool) < PAGE_SIZE) { + err = secretmem_pool_increase(ctx, gfp); + if (err) + return NULL; + } + + addr = gen_pool_alloc(pool, PAGE_SIZE); + if (!addr) + return NULL; + + page = virt_to_page(addr); + get_page(page); + + return page; } static vm_fault_t secretmem_fault(struct vm_fault *vmf) { + struct secretmem_ctx *ctx = vmf->vma->vm_file->private_data; struct address_space *mapping = vmf->vma->vm_file->f_mapping; struct inode *inode = file_inode(vmf->vma->vm_file); pgoff_t offset = vmf->pgoff; - unsigned long addr; struct page *page; int ret = 0; @@ -66,22 +124,22 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf) page = find_get_entry(mapping, offset); if (!page) { - page = secretmem_alloc_page(vmf->gfp_mask); + page = secretmem_alloc_page(ctx, vmf->gfp_mask); if (!page) return vmf_error(-ENOMEM); + /* + * add_to_page_cache() calls mem_cgroup_charge(), so we + * need to uncharge here to avoid double accounting + */ + memcg_kmem_uncharge_page(page, 0); + ret = add_to_page_cache(page, mapping, offset, vmf->gfp_mask); if (unlikely(ret)) goto err_put_page; - ret = set_direct_map_invalid_noflush(page, 1); - if (ret) - goto err_del_page_cache; - - addr = (unsigned long)page_address(page); - flush_tlb_kernel_range(addr, addr + PAGE_SIZE); - __SetPageUptodate(page); + set_page_private(page, (unsigned long)ctx); ret = VM_FAULT_LOCKED; } @@ -89,8 +147,6 @@ static vm_fault_t secretmem_fault(struct vm_fault *vmf) vmf->page = page; return ret; -err_del_page_cache: - delete_from_page_cache(page); err_put_page: put_page(page); return vmf_error(ret); @@ -143,7 +199,11 @@ static int secretmem_migratepage(struct address_space *mapping, static void secretmem_freepage(struct page *page) { - set_direct_map_default_noflush(page, 1); + unsigned long addr = (unsigned long)page_address(page); + struct secretmem_ctx *ctx = (struct secretmem_ctx *)page_private(page); + struct gen_pool *pool = ctx->pool; + + gen_pool_free(pool, addr, PAGE_SIZE); } static const struct address_space_operations secretmem_aops = { @@ -178,13 +238,18 @@ static struct file *secretmem_file_create(unsigned long flags) if (!ctx) goto err_free_inode; + ctx->pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE); + if (!ctx->pool) + goto err_free_ctx; + file = alloc_file_pseudo(inode, secretmem_mnt, "secretmem", O_RDWR, &secretmem_fops); if (IS_ERR(file)) - goto err_free_ctx; + goto err_free_pool; mapping_set_unevictable(inode->i_mapping); + inode->i_private = ctx; inode->i_mapping->private_data = ctx; inode->i_mapping->a_ops = &secretmem_aops; @@ -198,6 +263,8 @@ static struct file *secretmem_file_create(unsigned long flags) return file; +err_free_pool: + gen_pool_destroy(ctx->pool); err_free_ctx: kfree(ctx); err_free_inode: @@ -236,11 +303,34 @@ SYSCALL_DEFINE1(memfd_secret, unsigned long, flags) return err; } +static void secretmem_cleanup_chunk(struct gen_pool *pool, + struct gen_pool_chunk *chunk, void *data) +{ + unsigned long start = chunk->start_addr; + unsigned long end = chunk->end_addr; + unsigned long nr_pages, addr; + + nr_pages = (end - start + 1) / PAGE_SIZE; + __kernel_map_pages(virt_to_page(start), nr_pages, 1); + + for (addr = start; addr < end; addr += PAGE_SIZE) + put_page(virt_to_page(addr)); +} + +static void secretmem_cleanup_pool(struct secretmem_ctx *ctx) +{ + struct gen_pool *pool = ctx->pool; + + gen_pool_for_each_chunk(pool, secretmem_cleanup_chunk, ctx); + gen_pool_destroy(pool); +} + static void secretmem_evict_inode(struct inode *inode) { struct secretmem_ctx *ctx = inode->i_private; truncate_inode_pages_final(&inode->i_data); + secretmem_cleanup_pool(ctx); clear_inode(inode); kfree(ctx); } -- 2.28.0 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv