All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jens Axboe <axboe@kernel.dk>,
	Matthew Wilcox <willy@infradead.org>,
	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
	Leon Romanovsky <leon@kernel.org>,
	Christian Benvenuti <benve@cisco.com>,
	Nelson Escobar <neescoba@cisco.com>,
	Bernard Metzler <bmt@zurich.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Bjorn Topel <bjorn@kernel.org>,
	Magnus Karlsson <magnus.karlsson@intel.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Jonathan Lemon <jonathan.lemon@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Christian Brauner <brauner@kernel.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>, Jan Kara <jack@suse.cz>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Pavel Begunkov <asml.silence@gmail.com>,
	Mika Penttila <mpenttil@redhat.com>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>,
	Peter Xu <peterx@redhat.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>
Subject: Re: [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic
Date: Tue, 2 May 2023 19:16:36 +0200	[thread overview]
Message-ID: <03e591ce-debc-bba1-c55e-ce590cc1f38d@redhat.com> (raw)
In-Reply-To: <bf04a98a-9de6-4532-a36c-59572d22dd7c@lucifer.local>

On 02.05.23 19:09, Lorenzo Stoakes wrote:
> On Tue, May 02, 2023 at 05:53:46PM +0100, Lorenzo Stoakes wrote:
>> On Tue, May 02, 2023 at 06:38:53PM +0200, David Hildenbrand wrote:
>>> On 02.05.23 18:34, Lorenzo Stoakes wrote:
>>>> vma_wants_writenotify() is specifically intended for setting PTE page table
>>>> flags, accounting for existing PTE flag state and whether that might
>>>> already be read-only while mixing this check with a check whether the
>>>> filesystem performs dirty tracking.
>>>>
>>>> Separate out the notions of dirty tracking and a PTE write notify checking
>>>> in order that we can invoke the dirty tracking check from elsewhere.
>>>>
>>>> Note that this change introduces a very small duplicate check of the
>>>> separated out vm_ops_needs_writenotify(). This is necessary to avoid making
>>>> vma_needs_dirty_tracking() needlessly complicated (e.g. passing a
>>>> check_writenotify flag or having it assume this check was already
>>>> performed). This is such a small check that it doesn't seem too egregious
>>>> to do this.
>>>>
>>>> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
>>>> Reviewed-by: John Hubbard <jhubbard@nvidia.com>
>>>> Reviewed-by: Mika Penttilä <mpenttil@redhat.com>
>>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> ---
>>>>    include/linux/mm.h |  1 +
>>>>    mm/mmap.c          | 36 +++++++++++++++++++++++++++---------
>>>>    2 files changed, 28 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index 27ce77080c79..7b1d4e7393ef 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -2422,6 +2422,7 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
>>>>    #define  MM_CP_UFFD_WP_ALL                 (MM_CP_UFFD_WP | \
>>>>    					    MM_CP_UFFD_WP_RESOLVE)
>>>> +bool vma_needs_dirty_tracking(struct vm_area_struct *vma);
>>>>    int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot);
>>>>    static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma)
>>>>    {
>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>> index 5522130ae606..295c5f2e9bd9 100644
>>>> --- a/mm/mmap.c
>>>> +++ b/mm/mmap.c
>>>> @@ -1475,6 +1475,31 @@ SYSCALL_DEFINE1(old_mmap, struct mmap_arg_struct __user *, arg)
>>>>    }
>>>>    #endif /* __ARCH_WANT_SYS_OLD_MMAP */
>>>> +/* Do VMA operations imply write notify is required? */
>>>> +static bool vm_ops_needs_writenotify(const struct vm_operations_struct *vm_ops)
>>>> +{
>>>> +	return vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite);
>>>> +}
>>>> +
>>>> +/*
>>>> + * Does this VMA require the underlying folios to have their dirty state
>>>> + * tracked?
>>>> + */
>>>> +bool vma_needs_dirty_tracking(struct vm_area_struct *vma)
>>>> +{
>>>
>>> Sorry for not noticing this earlier, but ...
>>
>> pints_owed++

Having tired eyes and jumping back and forth between tasks really seems 
to start getting expensive ;)

>>
>>>
>>> what about MAP_PRIVATE mappings? When we write, we populate an anon page,
>>> which will work as expected ... because we don't have to notify the fs?
>>>
>>> I think you really also want the "If it was private or non-writable, the
>>> write bit is already clear */" part as well and remove "false" in that case.
>>>
>>
>> Not sure a 'write bit is already clear' case is relevant to checking
>> whether a filesystem dirty tracks? That seems specific entirely to the page
>> table bits.
>>
>> That's why I didn't include it,
>>
>> A !VM_WRITE shouldn't be GUP-writable except for FOLL_FORCE, and that
>> surely could be problematic if VM_MAYWRITE later?
>>
>> Thinking about it though a !VM_SHARE should probably can be safely assumed
>> to not be dirty-trackable, so we probably do need to add a check for
>> !VM_SHARED -> !vma_needs_dirty_tracking
>>
> 
> On second thoughts, we explicitly check FOLL_FORCE && !is_cow_mapping() in
> check_vma_flags() so that case cannot occur.
> 
> So actually yes we should probably include this on the basis of that and
> the fact that a FOLL_WRITE operation will CoW the MAP_PRIVATE mapping.
> 

Yes, we only allow to FOLL_FORCE write to (exclusive) anonymous pages 
that are mapped read-only. If it's not that, we trigger a (fake) write 
fault.


-- 
Thanks,

David / dhildenb


  reply	other threads:[~2023-05-02 17:17 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-02 16:34 [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 1/3] mm/mmap: separate writenotify and dirty tracking logic Lorenzo Stoakes
2023-05-02 16:38   ` David Hildenbrand
2023-05-02 16:53     ` Lorenzo Stoakes
2023-05-02 17:09       ` Lorenzo Stoakes
2023-05-02 17:16         ` David Hildenbrand [this message]
2023-05-02 16:34 ` [PATCH v7 2/3] mm/gup: disallow FOLL_LONGTERM GUP-nonfast writing to file-backed mappings Lorenzo Stoakes
2023-05-02 16:42   ` David Hildenbrand
2023-05-02 16:56     ` Lorenzo Stoakes
2023-05-02 16:34 ` [PATCH v7 3/3] mm/gup: disallow FOLL_LONGTERM GUP-fast " Lorenzo Stoakes
2023-05-02 17:13   ` David Hildenbrand
2023-05-02 17:22     ` Peter Zijlstra
2023-05-02 17:34       ` David Hildenbrand
2023-05-02 18:17         ` Lorenzo Stoakes
2023-05-02 19:07           ` David Hildenbrand
2023-05-02 19:07           ` Jason Gunthorpe
2023-05-02 19:25             ` Lorenzo Stoakes
2023-05-02 19:33               ` David Hildenbrand
2023-05-02 19:37                 ` Lorenzo Stoakes
2023-05-02 23:51                 ` Jason Gunthorpe
2023-05-03  0:22               ` Jason Gunthorpe
2023-05-02 18:59         ` Peter Zijlstra
2023-05-02 19:05           ` David Hildenbrand
2023-05-02 17:34       ` Lorenzo Stoakes
2023-05-02 17:31     ` Lorenzo Stoakes
2023-05-02 17:38       ` David Hildenbrand
2023-05-02 17:45         ` Lorenzo Stoakes
2023-05-02 19:17   ` David Hildenbrand
2023-05-02 19:45     ` Lorenzo Stoakes
2023-05-02 18:45 ` [PATCH v7 0/3] mm/gup: disallow GUP writing to file-backed mappings by default Matthew Rosato
2023-05-02 18:53   ` Lorenzo Stoakes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=03e591ce-debc-bba1-c55e-ce590cc1f38d@redhat.com \
    --to=david@redhat.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=benve@cisco.com \
    --cc=bjorn@kernel.org \
    --cc=bmt@zurich.ibm.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=david@fromorbit.com \
    --cc=dennis.dalessandro@cornelisnetworks.com \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=irogers@google.com \
    --cc=jack@suse.cz \
    --cc=jgg@nvidia.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=jonathan.lemon@gmail.com \
    --cc=kirill@shutemov.name \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=lstoakes@gmail.com \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=mpenttil@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=neescoba@cisco.com \
    --cc=netdev@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=pabeni@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=richardcochran@gmail.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.