From: John Hubbard <jhubbard@nvidia.com>
To: Jan Kara <jack@suse.cz>, Matthew Wilcox <willy@infradead.org>
Cc: "Michal Hocko" <mhocko@kernel.org>,
john.hubbard@gmail.com,
"Andrew Morton" <akpm@linux-foundation.org>,
"Christoph Hellwig" <hch@infradead.org>,
"Dan Williams" <dan.j.williams@intel.com>,
"Dave Chinner" <david@fromorbit.com>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Ira Weiny" <ira.weiny@intel.com>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Jérôme Glisse" <jglisse@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
amd-gfx@lists.freedesktop.org, ceph-devel@vger.kernel.org,
devel@driverdev.osuosl.org, devel@lists.orangefs.org,
dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-block@vger.kernel.org, linux-crypto@vger.kernel.org,
linux-fbdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-media@vger.kernel.org, linux-mm@kvack.org,
linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-rpi-kernel@lists.infradead.org, linux-xfs@vger.kernel.org,
netdev@vger.kernel.org, rds-devel@oss.oracle.com,
sparclinux@vger.kernel.org, x86@kernel.org,
xen-devel@lists.xenproject.org
Subject: Re: [PATCH 00/34] put_user_pages(): miscellaneous call sites
Date: Fri, 2 Aug 2019 12:14:09 -0700
Message-ID: <076e7826-67a5-4829-aae2-2b90f302cebd@nvidia.com>
In-Reply-To: <20190802145227.GQ25064@quack2.suse.cz>
On 8/2/19 7:52 AM, Jan Kara wrote:
> On Fri 02-08-19 07:24:43, Matthew Wilcox wrote:
>> On Fri, Aug 02, 2019 at 02:41:46PM +0200, Jan Kara wrote:
>>> On Fri 02-08-19 11:12:44, Michal Hocko wrote:
>>>> On Thu 01-08-19 19:19:31, john.hubbard@gmail.com wrote:
>>>> [...]
>>>>> 2) Convert all of the call sites for get_user_pages*(), to
>>>>> invoke put_user_page*(), instead of put_page(). This involves dozens of
>>>>> call sites, and will take some time.
>>>>
>>>> How do we make sure this is the case and it will remain the case in the
>>>> future? There must be some automagic to enforce/check that. It is simply
>>>> not manageable to do it every now and then because then 3) will simply
>>>> be never safe.
>>>>
>>>> Have you considered coccinelle or some other scripted way to do the
>>>> transition? I have no idea how to deal with future changes that would
>>>> break the balance though.

Hi Michal,

Yes, I've thought about it, and coccinelle falls a bit short (it's not smart
enough to know which put_page() calls to convert). However, there is a debug
option planned: a yet-to-be-posted commit [1] uses struct page extensions
(obviously protected by CONFIG_DEBUG_GET_USER_PAGES_REFERENCES) to add
a redundant counter. That allows:

	void __put_page(struct page *page)
	{
		...
		/* Someone called put_page() instead of put_user_page() */
		WARN_ON_ONCE(atomic_read(&page_ext->pin_count) > 0);
		...
	}
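
To make the idea concrete, here is a rough user-space model of that check. It is only a sketch, not the code from [1]: struct model_page, the model_*() functions, and misuse_warnings are hypothetical stand-ins, and the real counter lives in a page extension and is updated atomically. The model assumes the check fires only when the last reference is dropped, as in the __put_page() snippet above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page plus its debug page extension. */
struct model_page {
	int refcount;   /* models the ordinary page reference count */
	int pin_count;  /* models the redundant, debug-only gup counter */
};

/* Number of times the modeled WARN would have fired. */
static int misuse_warnings;

/* Models get_user_pages() taking a reference on a single page. */
static void model_get_user_page(struct model_page *page)
{
	page->refcount++;
	page->pin_count++;
}

/* Models put_user_page(): the matching release for a gup reference. */
static void model_put_user_page(struct model_page *page)
{
	assert(page->pin_count > 0);
	page->pin_count--;
	page->refcount--;
}

/* Models __put_page(): runs once the last reference has been dropped. */
static void model_final_put(struct model_page *page)
{
	/* Someone called put_page() instead of put_user_page(). */
	if (page->pin_count > 0) {
		misuse_warnings++;
		fprintf(stderr, "WARN: page freed with pin_count=%d\n",
			page->pin_count);
	}
}

/* Models put_page(): drops a reference without touching pin_count. */
static void model_put_page(struct model_page *page)
{
	if (--page->refcount == 0)
		model_final_put(page);
}
```

In this model, a page that is pinned with model_get_user_page() and released with model_put_user_page() reaches the final put with pin_count == 0 and stays silent. If the same reference is dropped with model_put_page() instead, pin_count is still 1 when the last reference goes away, and the warning fires.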
>>>
>>> Yeah, that's why I've been suggesting at LSF/MM that we may need to create
>>> a gup wrapper - say vaddr_pin_pages() - and track which sites dropping
>>> references got converted by using this wrapper instead of gup. The
>>> counterpart would then be more logically named as unpin_page() or whatever
>>> instead of put_user_page(). Sure this is not completely foolproof (you can
>>> create new callsite using vaddr_pin_pages() and then just drop refs using
>>> put_page()) but I suppose it would be a high enough barrier for missed
>>> conversions... Thoughts?

The debug option above is still a bit simplistic in its implementation (and maybe
not taking full advantage of the data it has), but I think it's preferable,
because it monitors the "core" and WARNs.

Instead of the wrapper, I'm thinking: documentation and the passage of time,
plus the debug option (perhaps enhanced; once I post it, someone will probably
notice opportunities), yes?

>>
>> I think the API we really need is get_user_bvec() / put_user_bvec(),
>> and I know Christoph has been putting some work into that. That avoids
>> doing refcount operations on hundreds of pages if the page in question is
>> a huge page. Once people are switched over to that, they won't be tempted
>> to manually call put_page() on the individual constituent pages of a bvec.
>
> Well, get_user_bvec() is certainly a good API for one class of users but
> just looking at the above series, you'll see there are *many* places that
> just don't work with bvecs at all and you need something for those.
>
Yes, there are quite a few places that don't involve _bvec, as we can see
right here. So we need something. Andrew asked for a debug option some time
ago, and several people (Dave Hansen, Dan Williams, Jerome) had the idea
of vmap-ing gup pages separately, so you can definitely tell where each
page came from. I'm hoping not to have to go to that level of complexity
though.
[1] "mm/gup: debug tracking of get_user_pages() references" :
https://github.com/johnhubbard/linux/commit/21ff7d6161ec2a14d3f9d17c98abb00cc969d4d6
thanks,
--
John Hubbard
NVIDIA