From: Ryan Roberts
Date: Thu, 11 May 2023 14:13:59 +0100
Subject: Re:
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", SeongJae Park
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, damon@lists.linux.dev
In-Reply-To: <20230511125848.78621-1-ryan.roberts@arm.com>
References: <20230511125848.78621-1-ryan.roberts@arm.com>

My apologies for the noise: a blank line between Cc and Subject has broken the
subject and grouping in lore. Please ignore this; I will resend.

On 11/05/2023 13:58, Ryan Roberts wrote:
> Date: Thu, 11 May 2023 11:38:28 +0100
> Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
>
> Hi All,
>
> This series improves the encapsulation of pte entries by disallowing non-arch
> code from directly dereferencing pte_t pointers. Instead, code must use a new
> helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
> dereference of the pointer, so generated code should be exactly the same. But
> its presence sets us up for arch code being able to override the default to
> "virtualize" the ptes without needing to maintain a shadow table.
>
> I intend to take advantage of this for arm64 to enable use of its "contiguous
> bit" to coalesce multiple ptes into a single tlb entry, reducing tlb pressure
> and improving performance. I have an RFC for the first part of this work at
> [1]. The cover letter there also explains the second part, which this series
> is enabling.
>
> I intend to post an RFC for the contpte changes in due course, but it would be
> good to get the ball rolling on this enabler.
>
> There are 2 reasons that I need the encapsulation:
>
>   - Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
>     code reads a pte that contains this bit, it could end up calling
>     set_pte_at() with the bit set, which would confuse the implementation. So
>     we can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a
>     leaky abstraction.
>   - Contiguous ptes have a single access and dirty bit for the contiguous range.
>     So we need to "mix-in" those bits when the core is dereferencing a pte that
>     lies in the contig range. There is code that dereferences the pte then takes
>     different actions based on access/dirty (see e.g. write_protect_page()).
>
> While ptep_get() and ptep_get_lockless() already exist, both of them are
> implemented using READ_ONCE() by default. While we could use ptep_get() instead
> of the new ptep_deref(), I didn't want to risk a performance regression.
> Alternatively, all call sites that currently use ptep_get() and need the
> lockless behaviour could be upgraded to ptep_get_lockless(), and ptep_get()
> could be downgraded to a simple dereference. That would be cleanest, but is a
> much bigger (and likely error-prone) change because all the arch code would
> need to be updated for the new definition of ptep_get().
>
> The series is split up as follows:
>
> patches 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
>              set_pte_at() and friends.
> patch 3:     Fix highmem unmapping issue I spotted while doing the work.
> patch 4:     Introduce the new ptep_deref() helper with default implementation.
> patch 5:     Convert all direct dereferences to use ptep_deref().
>
> [1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
>
> Thanks,
> Ryan
>
>
> Ryan Roberts (5):
>   mm: vmalloc must set pte via arch code
>   mm: damon must atomically clear young on ptes and pmds
>   mm: Fix failure to unmap pte on highmem systems
>   mm: Add new ptep_deref() helper to fully encapsulate pte_t
>   mm: ptep_deref() conversion
>
>  .../drm/i915/gem/selftests/i915_gem_mman.c |   8 +-
>  drivers/misc/sgi-gru/grufault.c            |   2 +-
>  drivers/vfio/vfio_iommu_type1.c            |   7 +-
>  drivers/xen/privcmd.c                      |   2 +-
>  fs/proc/task_mmu.c                         |  33 +++---
>  fs/userfaultfd.c                           |   6 +-
>  include/linux/hugetlb.h                    |   2 +-
>  include/linux/mm_inline.h                  |   2 +-
>  include/linux/pgtable.h                    |  13 ++-
>  kernel/events/uprobes.c                    |   2 +-
>  mm/damon/ops-common.c                      |  18 ++-
>  mm/damon/ops-common.h                      |   4 +-
>  mm/damon/paddr.c                           |   6 +-
>  mm/damon/vaddr.c                           |  14 ++-
>  mm/filemap.c                               |   2 +-
>  mm/gup.c                                   |  21 ++--
>  mm/highmem.c                               |  12 +-
>  mm/hmm.c                                   |   2 +-
>  mm/huge_memory.c                           |   4 +-
>  mm/hugetlb.c                               |   2 +-
>  mm/hugetlb_vmemmap.c                       |   6 +-
>  mm/kasan/init.c                            |   9 +-
>  mm/kasan/shadow.c                          |  10 +-
>  mm/khugepaged.c                            |  24 ++--
>  mm/ksm.c                                   |  22 ++--
>  mm/madvise.c                               |   6 +-
>  mm/mapping_dirty_helpers.c                 |   4 +-
>  mm/memcontrol.c                            |   4 +-
>  mm/memory-failure.c                        |   6 +-
>  mm/memory.c                                | 103 +++++++++---------
>  mm/mempolicy.c                             |   6 +-
>  mm/migrate.c                               |  14 ++-
>  mm/migrate_device.c                        |  14 ++-
>  mm/mincore.c                               |   2 +-
>  mm/mlock.c                                 |   6 +-
>  mm/mprotect.c                              |   8 +-
>  mm/mremap.c                                |   2 +-
>  mm/page_table_check.c                      |   4 +-
>  mm/page_vma_mapped.c                       |  26 +++--
>  mm/pgtable-generic.c                       |   2 +-
>  mm/rmap.c                                  |  32 +++---
>  mm/sparse-vmemmap.c                        |   8 +-
>  mm/swap_state.c                            |   4 +-
>  mm/swapfile.c                              |  16 +--
>  mm/userfaultfd.c                           |   4 +-
>  mm/vmalloc.c                               |  11 +-
>  mm/vmscan.c                                |  14 ++-
>  virt/kvm/kvm_main.c                        |   9 +-
>  48 files changed, 302 insertions(+), 236 deletions(-)
>
> --
> 2.25.1
>
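
For anyone skimming the quoted cover letter, here is a rough userspace sketch of
the ptep_deref() idea: a default that is a plain dereference, plus an arch
override that hides the private bit and mixes in the access/dirty bits of the
contiguous range. It is not code from the series. The toy pte_t, the TOY_* bit
positions, the block size of 4, toy_table and both function names are
assumptions made up for illustration; the real arm64 layout and the eventual
contpte implementation may look quite different.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: the real pte_t and its bit layout are arch-specific. */
typedef struct { uint64_t val; } pte_t;

#define TOY_PTE_YOUNG  (UINT64_C(1) << 0)  /* hypothetical "accessed" bit */
#define TOY_PTE_DIRTY  (UINT64_C(1) << 1)  /* hypothetical "dirty" bit */
#define TOY_PTE_CONT   (UINT64_C(1) << 2)  /* hypothetical arch-private bit */
#define TOY_CONT_PTES  4                   /* entries per contiguous block */
#define TOY_CONT_BYTES (TOY_CONT_PTES * sizeof(pte_t))

/* One naturally aligned block, so its start can be found by masking. */
static _Alignas(TOY_CONT_BYTES) pte_t toy_table[TOY_CONT_PTES];

/* Default helper as the cover letter describes it: a plain dereference. */
static inline pte_t ptep_deref_default(pte_t *ptep)
{
	return *ptep;
}

/* Hypothetical arch override: hide TOY_PTE_CONT, mix in block-wide bits. */
static inline pte_t ptep_deref_toy_arch(pte_t *ptep)
{
	pte_t pte = *ptep;
	pte_t *start;
	int i;

	/* Entries outside a contiguous block need no special handling. */
	if (!(pte.val & TOY_PTE_CONT))
		return pte;

	/* Round the pointer down to the first entry of its block. */
	start = (pte_t *)((uintptr_t)ptep & ~(uintptr_t)(TOY_CONT_BYTES - 1));
	/* This sketch only supports pointers into toy_table. */
	assert(start >= toy_table && start < toy_table + TOY_CONT_PTES);

	/* In this toy model any entry may carry access/dirty: OR them in. */
	for (i = 0; i < TOY_CONT_PTES; i++)
		pte.val |= start[i].val & (TOY_PTE_YOUNG | TOY_PTE_DIRTY);

	/* Never expose the arch-private bit to the caller. */
	pte.val &= ~TOY_PTE_CONT;
	return pte;
}
```

With the override, a caller reading any entry of a contiguous block sees the
same access/dirty state and never sees TOY_PTE_CONT, so it cannot pass that bit
back into something like set_pte_at(). Entries without the bit take the early
return and behave exactly like the default helper.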