Subject: Re: [PATCH v19 5/8] mm: introduce memfd_secret system call to create "secret" memory areas
To: Michal Hocko
Cc: Mike Rapoport, Andrew Morton, Alexander Viro, Andy Lutomirski,
 Arnd Bergmann, Borislav Petkov, Catalin Marinas, Christopher Lameter,
 Dan Williams, Dave Hansen, Elena Reshetova, "H. Peter Anvin",
 Hagen Paul Pfeifer, Ingo Molnar, James Bottomley, Kees Cook, "Kirill A.
 Shutemov", Matthew Wilcox, Matthew Garrett, Mark Rutland, Mike Rapoport,
 Michael Kerrisk, Palmer Dabbelt, Palmer Dabbelt, Paul Walmsley,
 Peter Zijlstra, "Rafael J. Wysocki", Rick Edgecombe, Roman Gushchin,
 Shakeel Butt, Shuah Khan, Thomas Gleixner, Tycho Andersen, Will Deacon,
 Yury Norov, linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-nvdimm@lists.01.org,
 linux-riscv@lists.infradead.org, x86@kernel.org
References: <20210513184734.29317-1-rppt@kernel.org>
 <20210513184734.29317-6-rppt@kernel.org>
 <8e114f09-60e4-2343-1c42-1beaf540c150@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Message-ID: <00644dd8-edac-d3fd-a080-0a175fa9bf13@redhat.com>
Date: Tue, 18 May 2021 12:35:36 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 18.05.21 12:31, Michal Hocko wrote:
> On Tue 18-05-21 12:06:42, David Hildenbrand wrote:
>> On 18.05.21 11:59, Michal Hocko wrote:
>>> On Sun 16-05-21 10:29:24, Mike Rapoport wrote:
>>>> On Fri, May 14, 2021 at 11:25:43AM +0200, David Hildenbrand wrote:
>>> [...]
>>>>>> +	if (!page)
>>>>>> +		return VM_FAULT_OOM;
>>>>>> +
>>>>>> +	err = set_direct_map_invalid_noflush(page, 1);
>>>>>> +	if (err) {
>>>>>> +		put_page(page);
>>>>>> +		return vmf_error(err);
>>>>>
>>>>> Would we want to translate that to a proper VM_FAULT_..., which would most
>>>>> probably be VM_FAULT_OOM when we fail to allocate a pagetable?
>>>>
>>>> That's what vmf_error does, it translates -ESOMETHING to VM_FAULT_XYZ.
>>>
>>> I haven't read through the rest but this has just caught my attention.
>>> Is it really reasonable to trigger the oom killer when you cannot
>>> invalidate the direct mapping? From a quick look at the code it is quite
>>> unlikely to see ENOMEM from that path (it allocates small pages) but this
>>> can become quite subtle over time. Shouldn't this simply SIGBUS if it
>>> cannot manipulate the direct mapping regardless of the underlying reason
>>> for that?
>>>
>>
>> OTOH, it means our kernel zones are depleted, so we'd better reclaim somehow
>> ...
>
> Killing a userspace process seems to be just a bad way around that.
>
> Although I have to say openly that I am not a great fan of VM_FAULT_OOM
> in general. It is usually the wrong way to handle the failure
> because it happens outside of the allocation context so you lose all the
> details (e.g. allocation constraints, numa policy etc.). Also whenever
> there is ENOMEM then the allocation itself has already made sure that
> all the reclaim attempts have already been exhausted. Just consider an
> allocation with GFP_NOWAIT/NO_RETRY or similar to fail and propagate
> ENOMEM up the call stack. Turning that into the OOM killer sounds like a
> bad idea to me. But that is a more general topic. I have tried to bring
> this up in the past but there was not much interest in fixing it as
> it was not a pressing problem...
>

I'm certainly interested; it would mean that we actually want to try
recovering from VM_FAULT_OOM in various cases, and as you state, we might
have to supply more information to make that work reliably.

Having said that, I guess what we have here is just the same as when our
process fails to allocate a generic page table in __handle_mm_fault(),
when we fail p4d_alloc() and friends ...

-- 
Thanks,

David / dhildenb
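
For anyone following the error-path discussion above, here is a minimal
user-space sketch of the two behaviours being compared. It assumes that
vmf_error() maps -ENOMEM to VM_FAULT_OOM and every other error to
VM_FAULT_SIGBUS, which is my understanding of the helper at the time of
this thread. The sketch_* names and the enum values are hypothetical
stand-ins for illustration, not kernel identifiers.

```c
#include <errno.h>

/*
 * Stand-in fault codes for this sketch only (assumption): the kernel's
 * real VM_FAULT_* values are defined in its own headers and differ.
 */
enum sketch_fault {
	SKETCH_FAULT_OOM = 1,
	SKETCH_FAULT_SIGBUS = 2,
};

/*
 * Models the translation discussed in the thread: -ENOMEM becomes an
 * OOM fault (which may end up invoking the OOM killer), anything else
 * becomes SIGBUS.
 */
static inline enum sketch_fault sketch_vmf_error(int err)
{
	if (err == -ENOMEM)
		return SKETCH_FAULT_OOM;
	return SKETCH_FAULT_SIGBUS;
}

/*
 * Models the alternative raised in the thread for the direct-map path:
 * report SIGBUS regardless of the underlying reason for the failure.
 */
static inline enum sketch_fault sketch_always_sigbus(int err)
{
	(void)err; /* the reason is deliberately ignored */
	return SKETCH_FAULT_SIGBUS;
}
```

With these definitions, sketch_vmf_error(-ENOMEM) takes the OOM path,
while sketch_always_sigbus(-ENOMEM) would deliver SIGBUS to the faulting
task instead. The two only differ for -ENOMEM, which is exactly the case
under debate.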