From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, linux-alpha@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
sparclinux@vger.kernel.org, linux-um@lists.infradead.org,
etnaviv@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-samsung-soc@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-media@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-perf-users@vger.kernel.org,
linux-security-module@vger.kernel.org,
linux-kselftest@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Jason Gunthorpe <jgg@ziepe.ca>,
John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Hugh Dickins <hughd@google.com>, Nadav Amit <namit@vmware.com>,
Vlastimil Babka <vbabka@suse.cz>,
Matthew Wilcox <willy@infradead.org>,
Mike Kravetz <mike.kravetz@oracle.com>,
Muchun Song <songmuchun@bytedance.com>,
Shuah Khan <shuah@kernel.org>,
Lucas Stach <l.stach@pengutronix.de>,
David Airlie <airlied@gmail.com>,
Oded Gabbay <ogabbay@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Christoph Hellwig <hch@infradead.org>,
Alex Williamson <alex.williamson@redhat.com>,
David Hildenbrand <david@redhat.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andy Walls <awalls@md.metrocast.net>,
Anton Ivanov <anton.ivanov@cambridgegreys.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Bernard Metzler <bmt@zurich.ibm.com>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christian Benvenuti <benve@cisco.com>,
Christian Gmeiner <christian.gmeiner@gmail.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Daniel Vetter <daniel@ffwll.ch>,
Daniel Vetter <daniel.vetter@ffwll.ch>,
Dave Hansen <dave.hansen@linux.intel.com>,
"David S. Miller" <davem@davemloft.net>,
Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
Eric Biederman <ebiederm@xmission.com>,
Hans Verkuil <hverkuil@xs4all.nl>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Inki Dae <inki.dae@samsung.com>,
Ivan Kokshaysky <ink@jurassic.park.msu.ru>,
James Morris <jmorris@namei.org>, Jiri Olsa <jolsa@kernel.org>,
Johannes Berg <johannes@sipsolutions.net>,
Kees Cook <keescook@chromium.org>,
Kentaro Takeda <takedakn@nttdata.co.jp>,
Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>,
Kyungmin Park <kyungmin.park@samsung.com>,
Leon Romanovsky <leon@kernel.org>,
Leon Romanovsky <leonro@nvidia.com>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Mark Rutland <mark.rutland@arm.com>,
Matt Turner <mattst88@gmail.com>,
Mauro Carvalho Chehab <mchehab@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Namhyung Kim <namhyung@kernel.org>,
Nelson Escobar <neescoba@cisco.com>,
Nicholas Piggin <npiggin@gmail.com>,
Oleg Nesterov <oleg@redhat.com>, Paul Moore <paul@paul-moore.com>,
Peter Zijlstra <peterz@infradead.org>,
Richard Henderson <richard.henderson@linaro.org>,
Richard Weinberger <richard@nod.at>,
Russell King <linux+etnaviv@armlinux.org.uk>,
"Serge E. Hallyn" <serge@hallyn.com>,
Seung-Woo Kim <sw0312.kim@samsung.com>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
Thomas Bogendoerfer <tsbogend@alpha.franken.de>,
Thomas Gleixner <tglx@linutronix.de>,
Tomasz Figa <tfiga@chromium.org>, Will Deacon <will@kernel.org>
Subject: [PATCH mm-unstable v1 00/20] mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)
Date: Wed, 16 Nov 2022 11:26:39 +0100
Message-ID: <20221116102659.70287-1-david@redhat.com>

So far, we have not supported reliable R/O long-term pinning in COW mappings.
That means that if we trigger R/O long-term pinning in a MAP_PRIVATE
mapping, we can end up pinning the (R/O-mapped) shared zeropage or a
pagecache page.

The next write access then triggers a write fault and replaces the pinned
page with an exclusive anonymous page in the process page table; whatever the
process writes to that private page copy is not visible to the
owner of the previous page pin: for example, RDMA could read stale data.
The end result is essentially an unexpected and hard-to-debug memory
corruption.
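The effect can be observed from user space without any pinning API. The following is a minimal sketch, not part of the series: a MAP_SHARED mapping of a temporary file stands in for the owner of the page pin (it keeps referring to the pagecache page), while a MAP_PRIVATE mapping of the same page plays the process. The function name is made up for illustration, and the behavior shown assumes Linux.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if data written through a MAP_PRIVATE mapping stays invisible
 * through a MAP_SHARED mapping of the same file page, 0 if it is visible,
 * and -1 on setup errors. (Hypothetical helper, for illustration only.)
 */
int private_write_leaves_old_page_stale(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	FILE *file = tmpfile();
	char *shared, *priv;
	int stale = -1;

	if (!file)
		return -1;
	if (page_size <= 0 || ftruncate(fileno(file), page_size))
		goto out_close;

	/* Stand-in for the pin owner: it keeps looking at the pagecache page. */
	shared = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		      MAP_SHARED, fileno(file), 0);
	if (shared == MAP_FAILED)
		goto out_close;
	strcpy(shared, "old");

	/* COW mapping of the same file page, as the process would have it. */
	priv = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE, fileno(file), 0);
	if (priv == MAP_FAILED)
		goto out_unmap_shared;

	/* Read access: the private mapping still shows the pagecache page. */
	if (strcmp(priv, "old") != 0)
		goto out_unmap_priv;

	/* Write access: write fault, COW, private anonymous copy. */
	strcpy(priv, "new");

	/* The "pin owner" never sees the new data. */
	stale = strcmp(shared, "old") == 0 && strcmp(priv, "new") == 0;

out_unmap_priv:
	munmap(priv, (size_t)page_size);
out_unmap_shared:
	munmap(shared, (size_t)page_size);
out_close:
	fclose(file);
	return stale;
}
```

A return value of 1 corresponds to the situation described above: the process continues with its private copy, and whoever still holds the old page only ever sees the old contents.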

Some drivers tried to work around that limitation by using
"FOLL_FORCE|FOLL_WRITE|FOLL_LONGTERM" for R/O long-term pinning.
FOLL_WRITE triggers a write fault, if required, and breaks COW before
pinning the page. FOLL_FORCE is required because the VMA might lack write
permissions, and drivers wanted that case to work as well, just as
one would expect (no write access, but still triggering a write fault to
break COW).

However, that is not a practical solution, because:
(1) Drivers that don't stick to that undocumented and debatable pattern
    still run into that issue. For example, VFIO only uses
    FOLL_LONGTERM for R/O long-term pinning.
(2) Using FOLL_WRITE just to work around a COW mapping + page pinning
    limitation is unintuitive. FOLL_WRITE would, for example, mark the
    page softdirty or trigger uffd-wp, even though there isn't actually
    going to be any write access.
(3) The purpose of FOLL_FORCE is debug access, not access by arbitrary
    drivers despite a lack of VMA permissions.

So instead, make R/O long-term pinning work as expected by breaking COW
in a COW mapping early, such that we can remove any FOLL_FORCE usage from
drivers and make FOLL_FORCE ptrace-specific (renaming it to FOLL_PTRACE).
More details in patch #9.

Patches #1--#3 add COW tests for non-anonymous pages.
Patches #4--#8 prepare core MM for extended FAULT_FLAG_UNSHARE support in
COW mappings.
Patch #9 implements reliable R/O long-term pinning in COW mappings.
Patches #10--#19 remove any FOLL_FORCE usage from drivers.
Patch #20 renames FOLL_FORCE to FOLL_PTRACE.
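As a rough illustration of the driver-side change described above, the following user-space model contrasts the workaround flag combination with a plain R/O long-term pin request. It is only a sketch: the DEMO_* constants are illustrative stand-ins and not the kernel's actual bit values, all function names are hypothetical, and the predicate is a simplification of the rule stated in this cover letter (an R/O long-term pin in a COW mapping has to break COW early), not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins only: NOT the kernel's actual flag values. */
enum {
	DEMO_FOLL_WRITE    = 1 << 0,
	DEMO_FOLL_FORCE    = 1 << 1,
	DEMO_FOLL_LONGTERM = 1 << 2,
};

/*
 * Workaround pattern: always request write access so that COW gets broken,
 * and add FORCE for R/O requests because the VMA might lack write permissions.
 */
unsigned int demo_gup_flags_workaround(bool writable)
{
	unsigned int flags = DEMO_FOLL_LONGTERM | DEMO_FOLL_WRITE;

	if (!writable)
		flags |= DEMO_FOLL_FORCE;
	return flags;
}

/* Without the workaround: request write access only when actually writing. */
unsigned int demo_gup_flags_plain(bool writable)
{
	unsigned int flags = DEMO_FOLL_LONGTERM;

	if (writable)
		flags |= DEMO_FOLL_WRITE;
	return flags;
}

/*
 * Simplified rule from the text: an R/O long-term pin in a COW mapping must
 * break COW before pinning, so that the pin refers to the page the process
 * will keep using.
 */
bool demo_must_break_cow_early(unsigned int flags, bool cow_mapping)
{
	return cow_mapping &&
	       (flags & DEMO_FOLL_LONGTERM) &&
	       !(flags & DEMO_FOLL_WRITE);
}
```

In this model, `demo_gup_flags_plain(false)` yields only the long-term flag, and `demo_must_break_cow_early()` then reports that COW has to be broken when the mapping is a COW mapping; with the workaround flags, the write request itself breaks COW, at the cost of the side effects listed in (2).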

I'm refraining from CCing all driver/arch maintainers on the whole patch
set and only CC them on the cover letter and the applicable patch
(I know, I know, someone is always unhappy ... sorry).

RFC -> v1:
* Use term "ptrace" instead of "debuggers" in patch descriptions
* Added ACK/Tested-by
* "mm/frame-vector: remove FOLL_FORCE usage"
 -> Adjust description
* "mm: rename FOLL_FORCE to FOLL_PTRACE"
 -> Added

David Hildenbrand (20):
  selftests/vm: anon_cow: prepare for non-anonymous COW tests
  selftests/vm: cow: basic COW tests for non-anonymous pages
  selftests/vm: cow: R/O long-term pinning reliability tests for
    non-anon pages
  mm: add early FAULT_FLAG_UNSHARE consistency checks
  mm: add early FAULT_FLAG_WRITE consistency checks
  mm: rework handling in do_wp_page() based on private vs. shared
    mappings
  mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for
    private mappings
  mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping
  mm/gup: reliable R/O long-term pinning in COW mappings
  RDMA/umem: remove FOLL_FORCE usage
  RDMA/usnic: remove FOLL_FORCE usage
  RDMA/siw: remove FOLL_FORCE usage
  media: videobuf-dma-sg: remove FOLL_FORCE usage
  drm/etnaviv: remove FOLL_FORCE usage
  media: pci/ivtv: remove FOLL_FORCE usage
  mm/frame-vector: remove FOLL_FORCE usage
  drm/exynos: remove FOLL_FORCE usage
  RDMA/hw/qib/qib_user_pages: remove FOLL_FORCE usage
  habanalabs: remove FOLL_FORCE usage
  mm: rename FOLL_FORCE to FOLL_PTRACE

arch/alpha/kernel/ptrace.c | 6 +-
arch/arm64/kernel/mte.c | 2 +-
arch/ia64/kernel/ptrace.c | 10 +-
arch/mips/kernel/ptrace32.c | 4 +-
arch/mips/math-emu/dsemul.c | 2 +-
arch/powerpc/kernel/ptrace/ptrace32.c | 4 +-
arch/sparc/kernel/ptrace_32.c | 4 +-
arch/sparc/kernel/ptrace_64.c | 8 +-
arch/x86/kernel/step.c | 2 +-
arch/x86/um/ptrace_32.c | 2 +-
arch/x86/um/ptrace_64.c | 2 +-
drivers/gpu/drm/etnaviv/etnaviv_gem.c | 8 +-
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 2 +-
drivers/infiniband/core/umem.c | 8 +-
drivers/infiniband/hw/qib/qib_user_pages.c | 2 +-
drivers/infiniband/hw/usnic/usnic_uiom.c | 9 +-
drivers/infiniband/sw/siw/siw_mem.c | 9 +-
drivers/media/common/videobuf2/frame_vector.c | 2 +-
drivers/media/pci/ivtv/ivtv-udma.c | 2 +-
drivers/media/pci/ivtv/ivtv-yuv.c | 5 +-
drivers/media/v4l2-core/videobuf-dma-sg.c | 14 +-
drivers/misc/habanalabs/common/memory.c | 3 +-
fs/exec.c | 2 +-
fs/proc/base.c | 2 +-
include/linux/mm.h | 35 +-
include/linux/mm_types.h | 8 +-
kernel/events/uprobes.c | 4 +-
kernel/ptrace.c | 12 +-
mm/gup.c | 38 +-
mm/huge_memory.c | 13 +-
mm/hugetlb.c | 14 +-
mm/memory.c | 97 +++--
mm/util.c | 4 +-
security/tomoyo/domain.c | 2 +-
tools/testing/selftests/vm/.gitignore | 2 +-
tools/testing/selftests/vm/Makefile | 10 +-
tools/testing/selftests/vm/check_config.sh | 4 +-
.../selftests/vm/{anon_cow.c => cow.c} | 387 +++++++++++++++++-
tools/testing/selftests/vm/run_vmtests.sh | 8 +-
39 files changed, 575 insertions(+), 177 deletions(-)
rename tools/testing/selftests/vm/{anon_cow.c => cow.c} (75%)
--
2.38.1

Thread overview: 57+ messages
2022-11-16 10:26 David Hildenbrand [this message]
2022-11-16 10:26 ` [PATCH mm-unstable v1 01/20] selftests/vm: anon_cow: prepare for non-anonymous COW tests David Hildenbrand
2022-11-18 16:20 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 02/20] selftests/vm: cow: basic COW tests for non-anonymous pages David Hildenbrand
2022-11-16 10:26 ` [PATCH mm-unstable v1 03/20] selftests/vm: cow: R/O long-term pinning reliability tests for non-anon pages David Hildenbrand
2022-11-16 10:26 ` [PATCH mm-unstable v1 04/20] mm: add early FAULT_FLAG_UNSHARE consistency checks David Hildenbrand
2022-11-18 16:45 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 05/20] mm: add early FAULT_FLAG_WRITE " David Hildenbrand
2022-11-18 17:03 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 06/20] mm: rework handling in do_wp_page() based on private vs. shared mappings David Hildenbrand
2022-11-22 14:20 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 07/20] mm: don't call vm_ops->huge_fault() in wp_huge_pmd()/wp_huge_pud() for private mappings David Hildenbrand
2022-11-22 14:50 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 08/20] mm: extend FAULT_FLAG_UNSHARE support to anything in a COW mapping David Hildenbrand
2022-11-22 15:35 ` Vlastimil Babka
2022-11-16 10:26 ` [PATCH mm-unstable v1 09/20] mm/gup: reliable R/O long-term pinning in COW mappings David Hildenbrand
2022-11-16 10:42 ` Daniel Vetter
2022-11-22 16:29 ` Vlastimil Babka
2022-11-24 1:29 ` John Hubbard
2022-11-16 10:26 ` [PATCH mm-unstable v1 10/20] RDMA/umem: remove FOLL_FORCE usage David Hildenbrand
2022-11-17 0:45 ` Jason Gunthorpe
2022-11-16 10:26 ` [PATCH mm-unstable v1 11/20] RDMA/usnic: " David Hildenbrand
2022-11-17 0:45 ` Jason Gunthorpe
2022-11-16 10:26 ` [PATCH mm-unstable v1 12/20] RDMA/siw: " David Hildenbrand
2022-11-17 0:46 ` Jason Gunthorpe
2022-11-16 10:26 ` [PATCH mm-unstable v1 13/20] media: videobuf-dma-sg: " David Hildenbrand
2022-11-16 10:48 ` Daniel Vetter
2022-11-23 13:17 ` Hans Verkuil
2022-11-16 10:26 ` [PATCH mm-unstable v1 14/20] drm/etnaviv: " David Hildenbrand
2022-11-16 10:49 ` Daniel Vetter
2022-11-16 10:26 ` [PATCH mm-unstable v1 15/20] media: pci/ivtv: " David Hildenbrand
2022-11-23 13:18 ` Hans Verkuil
2022-11-16 10:26 ` [PATCH mm-unstable v1 16/20] mm/frame-vector: " David Hildenbrand
2022-11-16 10:50 ` Daniel Vetter
2022-11-23 13:26 ` Hans Verkuil
2022-11-23 14:28 ` Hans Verkuil
2022-11-27 10:35 ` David Hildenbrand
2022-11-28 8:17 ` Hans Verkuil
2022-11-28 8:18 ` David Hildenbrand
2022-11-28 8:26 ` Hans Verkuil
2022-11-28 8:57 ` Tomasz Figa
2022-11-28 22:59 ` Andrew Morton
2022-11-29 8:48 ` David Hildenbrand
2022-11-29 9:08 ` Hans Verkuil
2022-11-29 9:15 ` David Hildenbrand
2022-11-16 10:26 ` [PATCH mm-unstable v1 17/20] drm/exynos: " David Hildenbrand
2022-11-16 10:50 ` Daniel Vetter
2022-11-16 10:26 ` [PATCH mm-unstable v1 18/20] RDMA/hw/qib/qib_user_pages: " David Hildenbrand
2022-11-16 10:26 ` [PATCH mm-unstable v1 19/20] habanalabs: " David Hildenbrand
2022-11-16 10:26 ` [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE David Hildenbrand
2022-11-16 18:16 ` Linus Torvalds
2022-11-16 18:53 ` David Hildenbrand
2022-11-17 22:58 ` Kees Cook
2022-11-17 23:20 ` Linus Torvalds
2022-11-18 0:31 ` Kees Cook
2022-11-18 11:09 ` Peter Zijlstra
2022-11-18 22:29 ` Kees Cook