From: David Hildenbrand
Organization: Red Hat
To: David Howells
Cc: "Teterevkov, Ivan", "linux-mm@kvack.org", "jhubbard@nvidia.com", "jack@suse.cz", "rppt@linux.ibm.com", "jglisse@redhat.com", "ira.weiny@intel.com", "linux-kernel@vger.kernel.org", Christoph Hellwig, Matthew Wilcox
Date: Thu, 13 Apr 2023 14:41:43 +0200
Subject: Re: find_get_page() VS pin_user_pages()
References: <93f2614e-4521-8bc8-2eca-e7ad03e7e399@redhat.com> <37946.1681288867@warthog.procyon.org.uk>
In-Reply-To: <37946.1681288867@warthog.procyon.org.uk>

On 12.04.23 10:41, David Howells wrote:
> David Hildenbrand wrote:
>
>> I suspect that find_get_page() is not the kind of interface you want to use
>> for the purpose you describe.
>> find_get_page() is a wrapper around
>> pagecache_get_page() and seems more like a helper for implementing an fs
>> (looking at the users and the fact that it only considers pages that are in
>> the pagecache).
>
> Btw, at some point we're going to need public functions to get extra pins on
> pages. vmsplice() should be pinning the pages it pushes into a pipe - so all
> pages in a pipe should probably be pinned - and anyone who splices a page out
> of a pipe and retains it (skbuffs spring strongly to mind) should also get a
> pin on the page.

As discussed, vmsplice() is a bit special, because it has long-term pinning
semantics: we'd want to migrate the page out of ZONE_MOVABLE/MIGRATE_CMA/...
because the page might remain pinned in the pipe possibly forever, controlled
by user space. pin_user_pages(FOLL_LONGTERM) would do the right thing, but we
might have to be careful with extra pins.

I guess it depends on what we want to achieve. Let's discuss what would happen
when we want to pin some page (and not going via pin_user_pages()) that's
definitely not an anon page -- so let's assume a pagecache page:

(a) Short-term pinning when already pinned (extra pins): easy.

(b) Short-term pinning when not pinned yet: should be fairly easy
    (pin_user_pages() doesn't do anything special for pagecache pages
    either).

(c) Long-term pinning when already long-term pinned (extra long-term
    pins): easy.

(d) Long-term pinning when already short-term pinned: problematic, because
    we might have to migrate the page first, but it's already pinned ...
    and if we obtained the page via pin_user_pages() from a MAP_PRIVATE VMA,
    we'd have to do another pin_user_pages(FOLL_LONGTERM) that would
    properly break COW and give us an anon page ...

(e) Long-term pinning when not pinned yet: fairly easy, but we might have
    to migrate the page first (like FOLL_LONGTERM would).

Regarding anon pages, we should pin only via pin_user_pages(), so the "not
pinned" case does not apply.
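
To make cases (a) through (e) above a bit more concrete, here is a minimal
user-space sketch in C. It only models the reasoning in this mail and is not
kernel code: the enums and classify_pin() are names made up for illustration,
and the three difficulty levels are just my summary of "easy", "might have to
migrate first" and "problematic".

```c
#include <assert.h>

/* Hypothetical model: how a pagecache (!anon) page is currently pinned. */
enum pin_state { NOT_PINNED, SHORT_TERM_PINNED, LONG_TERM_PINNED };

/* Hypothetical model: the kind of additional pin being requested. */
enum pin_request { WANT_SHORT_TERM, WANT_LONG_TERM };

/* Hypothetical summary of how hard the request is according to (a)-(e). */
enum pin_difficulty { PIN_EASY, PIN_MAY_NEED_MIGRATION, PIN_PROBLEMATIC };

static enum pin_difficulty classify_pin(enum pin_state cur,
					enum pin_request req)
{
	/* (a) and (b): short-term pins need no special treatment. */
	if (req == WANT_SHORT_TERM)
		return PIN_EASY;

	switch (cur) {
	case LONG_TERM_PINNED:
		/* (c): the page already satisfies long-term requirements. */
		return PIN_EASY;
	case SHORT_TERM_PINNED:
		/* (d): migration may be required, but the page is pinned. */
		return PIN_PROBLEMATIC;
	case NOT_PINNED:
		/* (e): like FOLL_LONGTERM, migrate first if needed. */
		return PIN_MAY_NEED_MIGRATION;
	}

	assert(!"unknown pin_state");
	return PIN_PROBLEMATIC;
}
```

For example, classify_pin(SHORT_TERM_PINNED, WANT_LONG_TERM) yields
PIN_PROBLEMATIC, which is case (d).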
Replicating pins -- (a) and (c) -- is usually easy, but (d) is similarly
problematic.

Focusing again on !anon pages: if it's just "get another short-term pin on an
already pinned page", it's easy (and I recall John H. had patches). If it's
"get a long-term pin on an already pinned page", it can be problematic.

Any pages that will never have to be migrated when long-term pinning (just
some allocated kernel page without MOVABLE semantics) are super easy to pin,
and to add extra pins to.

>
> So should all pages held by an skbuff be pinned rather than ref'd? I have a
> patch to use the bottom two bits of an skb frag's page pointer to keep track
> of whether the page it points to is ref'd, pinned or neither, but if we can
> make it pin/not-pin them, I only need one bit for that.

It might possibly be the right thing. But ref'd vs. pinned really only makes a
difference for (a) pages mapped into user space or (b) pages in the pagecache.

Of course, in any case, long-term semantics have to be respected if the page
to pin might have been allocated with MOVABLE semantics.

-- 
Thanks,

David / dhildenb
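
For reference, the "bottom two bits of an skb frag's page pointer" idea quoted
above can be sketched in user-space C roughly as follows. This is not the
patch being discussed and not kernel code: struct page here is a stand-in,
and frag_encode(), frag_page(), frag_state() and the FRAG_* names are
hypothetical. The sketch assumes the page structure is at least 4-byte
aligned, so the two low bits of its address are always zero.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's struct page; assumed to be >= 4-byte aligned. */
struct page {
	unsigned long flags;
};

/* Hypothetical encoding of how the frag holds its page. */
enum frag_hold_state {
	FRAG_NEITHER = 0,	/* no reference and no pin taken */
	FRAG_REFD    = 1,	/* holds a plain page reference */
	FRAG_PINNED  = 2,	/* holds a pin */
};

#define FRAG_STATE_MASK ((uintptr_t)0x3)

/* Store the hold state in the two low bits of the page pointer. */
static uintptr_t frag_encode(struct page *page, enum frag_hold_state state)
{
	uintptr_t raw = (uintptr_t)page;

	/* The low bits must be free, or the tag would corrupt the pointer. */
	assert((raw & FRAG_STATE_MASK) == 0);
	return raw | (uintptr_t)state;
}

/* Recover the original page pointer by masking off the tag bits. */
static struct page *frag_page(uintptr_t tagged)
{
	return (struct page *)(tagged & ~FRAG_STATE_MASK);
}

/* Recover the hold state from the tag bits. */
static enum frag_hold_state frag_state(uintptr_t tagged)
{
	return (enum frag_hold_state)(tagged & FRAG_STATE_MASK);
}
```

If frags only ever distinguish pinned from not pinned, the same scheme works
with a one-bit mask, which matches the "I only need one bit" remark.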