From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
SeongJae Park <sj@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, damon@lists.linux.dev
Subject: Re:
Date: Thu, 11 May 2023 14:13:59 +0100 [thread overview]
Message-ID: <e73130ff-f583-5382-8c7a-8d608dd26db6@arm.com> (raw)
In-Reply-To: <20230511125848.78621-1-ryan.roberts@arm.com>
My appologies for the noise: A blank line between Cc and Subject has broken the
subject and grouping in lore.
Please Ignore this, I will resend.
On 11/05/2023 13:58, Ryan Roberts wrote:
> Date: Thu, 11 May 2023 11:38:28 +0100
> Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
>
> Hi All,
>
> This series improves the encapsulation of pte entries by disallowing non-arch
> code from directly dereferencing pte_t pointers. Instead code must use a new
> helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
> dereference of the pointer, so generated code should be exactly the same. But
> it's presence sets us up for arch code being able to override the default to
> "virtualize" the ptes without needing to maintain a shadow table.
>
> I intend to take advantage of this for arm64 to enable use of its "contiguous
> bit" to coalesce multiple ptes into a single tlb entry, reducing pressure and
> improving performance. I have an RFC for the first part of this work at [1]. The
> cover letter there also explains the second part, which this series is enabling.
>
> I intend to post an RFC for the contpte changes in due course, but it would be
> good to get the ball rolling on this enabler.
>
> There are 2 reasons that I need the encapsulation:
>
> - Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
> code reads a pte that contains this bit, it could end up calling
> set_pte_at() with the bit set which would confuse the implementation. So we
> can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a leaky
> abstraction.
> - Contiguous ptes have a single access and dirty bit for the contiguous range.
> So we need to "mix-in" those bits when the core is dereferencing a pte that
> lies in the contig range. There is code that dereferences the pte then takes
> different actions based on access/dirty (see e.g. write_protect_page()).
>
> While ptep_get() and ptep_get_lockless() already exist, both of them are
> implemented using READ_ONCE() by default. While we could use ptep_get() instead
> of the new ptep_deref(), I didn't want to risk performance regression.
> Alternatively, all call sites that currently use ptep_get() that need the
> lockless behaviour could be upgraded to ptep_get_lockless() and ptep_get() could
> be downgraded to a simple dereference. That would be cleanest, but is a much
> bigger (and likely error prone) change because all the arch code would need to
> be updated for the new definitions of ptep_get().
>
> The series is split up as follows:
>
> patchs 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
> set_pte_at() and friends.
> patch 3: Fix highmem unmapping issue I spotted while doing the work.
> patch 4: Introduce the new ptep_deref() helper with default implementation.
> patch 5: Convert all direct dereferences to use ptep_deref().
>
> [1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
>
> Thanks,
> Ryan
>
>
> Ryan Roberts (5):
> mm: vmalloc must set pte via arch code
> mm: damon must atomically clear young on ptes and pmds
> mm: Fix failure to unmap pte on highmem systems
> mm: Add new ptep_deref() helper to fully encapsulate pte_t
> mm: ptep_deref() conversion
>
> .../drm/i915/gem/selftests/i915_gem_mman.c | 8 +-
> drivers/misc/sgi-gru/grufault.c | 2 +-
> drivers/vfio/vfio_iommu_type1.c | 7 +-
> drivers/xen/privcmd.c | 2 +-
> fs/proc/task_mmu.c | 33 +++---
> fs/userfaultfd.c | 6 +-
> include/linux/hugetlb.h | 2 +-
> include/linux/mm_inline.h | 2 +-
> include/linux/pgtable.h | 13 ++-
> kernel/events/uprobes.c | 2 +-
> mm/damon/ops-common.c | 18 ++-
> mm/damon/ops-common.h | 4 +-
> mm/damon/paddr.c | 6 +-
> mm/damon/vaddr.c | 14 ++-
> mm/filemap.c | 2 +-
> mm/gup.c | 21 ++--
> mm/highmem.c | 12 +-
> mm/hmm.c | 2 +-
> mm/huge_memory.c | 4 +-
> mm/hugetlb.c | 2 +-
> mm/hugetlb_vmemmap.c | 6 +-
> mm/kasan/init.c | 9 +-
> mm/kasan/shadow.c | 10 +-
> mm/khugepaged.c | 24 ++--
> mm/ksm.c | 22 ++--
> mm/madvise.c | 6 +-
> mm/mapping_dirty_helpers.c | 4 +-
> mm/memcontrol.c | 4 +-
> mm/memory-failure.c | 6 +-
> mm/memory.c | 103 +++++++++---------
> mm/mempolicy.c | 6 +-
> mm/migrate.c | 14 ++-
> mm/migrate_device.c | 14 ++-
> mm/mincore.c | 2 +-
> mm/mlock.c | 6 +-
> mm/mprotect.c | 8 +-
> mm/mremap.c | 2 +-
> mm/page_table_check.c | 4 +-
> mm/page_vma_mapped.c | 26 +++--
> mm/pgtable-generic.c | 2 +-
> mm/rmap.c | 32 +++---
> mm/sparse-vmemmap.c | 8 +-
> mm/swap_state.c | 4 +-
> mm/swapfile.c | 16 +--
> mm/userfaultfd.c | 4 +-
> mm/vmalloc.c | 11 +-
> mm/vmscan.c | 14 ++-
> virt/kvm/kvm_main.c | 9 +-
> 48 files changed, 302 insertions(+), 236 deletions(-)
>
> --
> 2.25.1
>
next prev parent reply other threads:[~2023-05-11 13:14 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-11 12:58 Ryan Roberts
2023-05-11 12:58 ` [PATCH v1 1/5] mm: vmalloc must set pte via arch code Ryan Roberts
2023-05-11 13:14 ` Ryan Roberts
2023-05-11 12:58 ` [PATCH v1 2/5] mm: damon must atomically clear young on ptes and pmds Ryan Roberts
2023-05-11 13:14 ` Ryan Roberts
2023-05-11 12:58 ` [PATCH v1 3/5] mm: Fix failure to unmap pte on highmem systems Ryan Roberts
2023-05-11 13:14 ` Ryan Roberts
2023-05-11 12:58 ` [PATCH v1 4/5] mm: Add new ptep_deref() helper to fully encapsulate pte_t Ryan Roberts
2023-05-11 13:14 ` Ryan Roberts
2023-05-11 12:58 ` [PATCH v1 5/5] mm: ptep_deref() conversion Ryan Roberts
2023-05-11 13:14 ` Ryan Roberts
2023-05-12 10:34 ` kernel test robot
2023-05-11 13:13 ` Ryan Roberts [this message]
-- strict thread matches above, loose matches on Subject: below --
2024-02-27 17:42 [PATCH v3 00/18] Rearrange batched folio freeing Matthew Wilcox (Oracle)
2024-02-27 17:42 ` [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed Matthew Wilcox (Oracle)
2024-03-06 13:42 ` Ryan Roberts
2024-03-06 16:09 ` Matthew Wilcox
2024-03-06 16:19 ` Ryan Roberts
2024-03-06 17:41 ` Ryan Roberts
2024-03-06 18:41 ` Zi Yan
2024-03-06 19:55 ` Matthew Wilcox
2024-03-06 21:55 ` Matthew Wilcox
2024-03-07 8:56 ` Ryan Roberts
2024-03-07 13:50 ` Yin, Fengwei
2024-03-07 14:05 ` Re: Matthew Wilcox
2024-03-07 15:24 ` Re: Ryan Roberts
2024-03-07 16:24 ` Re: Ryan Roberts
2024-03-07 23:02 ` Re: Matthew Wilcox
2024-03-08 1:06 ` Re: Yin, Fengwei
2024-01-18 22:19 [RFC] [PATCH 0/3] xfs: use large folios for buffers Dave Chinner
2024-01-22 10:13 ` Andi Kleen
2024-01-22 11:53 ` Dave Chinner
2022-08-26 22:03 Zach O'Keefe
2022-08-31 21:47 ` Yang Shi
2022-09-01 0:24 ` Re: Zach O'Keefe
[not found] <20220421164138.1250943-1-yury.norov@gmail.com>
2022-04-21 23:04 ` Re: John Hubbard
2022-04-21 23:09 ` Re: John Hubbard
2022-04-21 23:17 ` Re: Yury Norov
2022-04-21 23:21 ` Re: John Hubbard
2021-08-12 9:21 Valdis Klētnieks
2021-08-12 9:42 ` SeongJae Park
2021-08-12 20:19 ` Re: Andrew Morton
2021-08-13 8:14 ` Re: SeongJae Park
[not found] <bug-200209-27@https.bugzilla.kernel.org/>
2018-06-28 3:48 ` Andrew Morton
2018-06-29 18:44 ` Andrey Ryabinin
2018-06-24 20:09 [PATCH 3/3] mm: list_lru: Add lock_irq member to __list_lru_init() Vladimir Davydov
2018-07-03 14:52 ` Sebastian Andrzej Siewior
2018-07-03 21:14 ` Andrew Morton
2018-07-03 21:44 ` Re: Sebastian Andrzej Siewior
2018-07-04 14:44 ` Re: Vladimir Davydov
2007-04-03 18:41 Royal VIP Casino
[not found] <F265RQAOCop3wyv9kI3000143b1@hotmail.com>
2001-10-08 11:48 ` Joseph A Knapka
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e73130ff-f583-5382-8c7a-8d608dd26db6@arm.com \
--to=ryan.roberts@arm.com \
--cc=akpm@linux-foundation.org \
--cc=damon@lists.linux.dev \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=sj@kernel.org \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).