linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Shakeel Butt <shakeelb@google.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Michal Hocko <mhocko@kernel.org>, Nadav Amit <namit@vmware.com>,
	Rik van Riel <riel@surriel.com>, Roman Gushchin <guro@fb.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Peter Xu <peterx@redhat.com>, Donald Dutile <ddutile@redhat.com>,
	Christoph Hellwig <hch@lst.de>, Oleg Nesterov <oleg@redhat.com>,
	Jan Kara <jack@suse.cz>, Liang Zhang <zhangliang5@huawei.com>,
	Pedro Gomes <pedrodemargomes@gmail.com>,
	Oded Gabbay <oded.gabbay@gmail.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	linux-mm@kvack.org, x86@kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org
Subject: Re: [PATCH v1 5/7] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE
Date: Tue, 15 Mar 2022 17:21:02 +0100	[thread overview]
Message-ID: <20220315172102.771bd2cf@thinkpad> (raw)
In-Reply-To: <20220315141837.137118-6-david@redhat.com>

On Tue, 15 Mar 2022 15:18:35 +0100
David Hildenbrand <david@redhat.com> wrote:

> Let's steal one bit from the offset. While at it, document the meaning
> of bit 62 for swap ptes.

You define _PAGE_SWP_EXCLUSIVE as _PAGE_LARGE, which is bit 52, and
this is not part of the swap pte offset IIUC. So stealing any bit might
actually not be necessary, see below.

Also, bit 62 should be the soft dirty bit for normal PTEs, and this
doesn't seem to be used for swap PTEs at all. But I might be missing
some use case where softdirty also needs to be preserved in swap PTEs.

> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/s390/include/asm/pgtable.h | 37 ++++++++++++++++++++++++++-------
>  1 file changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 008a6c856fa4..c182212a2b44 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -181,6 +181,8 @@ static inline int is_module_addr(void *addr)
>  #define _PAGE_SOFT_DIRTY 0x000
>  #endif
>  
> +#define _PAGE_SWP_EXCLUSIVE _PAGE_LARGE	/* SW pte exclusive swap bit */
> +
>  /* Set of bits not changed in pte_modify */
>  #define _PAGE_CHG_MASK		(PAGE_MASK | _PAGE_SPECIAL | _PAGE_DIRTY | \
>  				 _PAGE_YOUNG | _PAGE_SOFT_DIRTY)
> @@ -796,6 +798,24 @@ static inline int pmd_protnone(pmd_t pmd)
>  }
>  #endif
>  
> +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE
> +static inline pte_t pte_swp_mkexclusive(pte_t pte)
> +{
> +	pte_val(pte) |= _PAGE_SWP_EXCLUSIVE;
> +	return pte;
> +}
> +
> +static inline int pte_swp_exclusive(pte_t pte)
> +{
> +	return pte_val(pte) & _PAGE_SWP_EXCLUSIVE;
> +}
> +
> +static inline pte_t pte_swp_clear_exclusive(pte_t pte)
> +{
> +	pte_val(pte) &= ~_PAGE_SWP_EXCLUSIVE;
> +	return pte;
> +}
> +
>  static inline int pte_soft_dirty(pte_t pte)
>  {
>  	return pte_val(pte) & _PAGE_SOFT_DIRTY;
> @@ -1675,16 +1695,19 @@ static inline int has_transparent_hugepage(void)
>   * information in the lowcore.
>   * Bits 54 and 63 are used to indicate the page type.
>   * A swap pte is indicated by bit pattern (pte & 0x201) == 0x200
> - * This leaves the bits 0-51 and bits 56-62 to store type and offset.
> - * We use the 5 bits from 57-61 for the type and the 52 bits from 0-51
> + * This leaves the bits 0-50 and bits 56-61 to store type and offset.
> + * We use the 5 bits from 57-61 for the type and the 51 bits from 0-50
>   * for the offset.
> - * |			  offset			|01100|type |00|
> - * |0000000000111111111122222222223333333333444444444455|55555|55566|66|
> - * |0123456789012345678901234567890123456789012345678901|23456|78901|23|
> + * |			  offset		       |E|01100|type |S0|
> + * |000000000011111111112222222222333333333344444444445|5|55555|55566|66|
> + * |012345678901234567890123456789012345678901234567890|1|23456|78901|23|
> + *
> + * S (bit 62) is used for softdirty tracking.

Unless there is some use for softdirty tracking in swap PTEs, I think
this description does not belong here, to the swap PTE layout.

> + * E (bit 51) is used to remember PG_anon_exclusive.

It is bit 52, at least with this patch, so I guess this could all be
done w/o stealing anything. That is, of course, only if it is allowed
to use bit 52 in this case. The POP says bit 52 has to be 0, or else
a "translation-specification exception" is recognized. However, I think
it could be OK for PTEs marked as invalid, like it is the case for swap
PTEs.

The comment here says at the beginning:
/*
 * 64 bit swap entry format:
 * A page-table entry has some bits we have to treat in a special way.
 * Bits 52 and bit 55 have to be zero, otherwise a specification
 * exception will occur instead of a page translation exception. The
 * specification exception has the bad habit not to store necessary
 * information in the lowcore.

This would mean that it is not OK to have bit 52 not zero for swap PTEs.
But if I read the POP correctly, all bits except for the DAT-protection
would be ignored for invalid PTEs, so maybe this comment needs some update
(for both bits 52 and also 55).

Heiko might also have some more insight.

Anyway, stealing bit 51 might still be an option, but then
_PAGE_SWP_EXCLUSIVE would need to be defined appropriately.


  reply	other threads:[~2022-03-15 16:21 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 14:18 [PATCH v1 0/7] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of anonymous pages David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 1/7] mm/swap: remember PG_anon_exclusive via a swp pte bit David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 2/7] mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 3/7] x86/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 4/7] arm64/pgtable: " David Hildenbrand
2022-03-16 18:27   ` Catalin Marinas
2022-03-17 10:04     ` David Hildenbrand
2022-03-17 17:58       ` Catalin Marinas
2022-03-18  9:59         ` David Hildenbrand
2022-03-18 11:33           ` Catalin Marinas
2022-03-18 14:14             ` David Hildenbrand
2022-03-21 14:38     ` Will Deacon
2022-03-21 14:39       ` Will Deacon
2022-03-21 15:07       ` David Hildenbrand
2022-03-21 17:44         ` Will Deacon
2022-03-21 18:27           ` Catalin Marinas
2022-03-22  9:46             ` David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 5/7] s390/pgtable: " David Hildenbrand
2022-03-15 16:21   ` Gerald Schaefer [this message]
2022-03-15 16:37     ` David Hildenbrand
2022-03-15 16:58       ` David Hildenbrand
2022-03-15 17:12         ` David Hildenbrand
2022-03-15 17:14           ` David Hildenbrand
2022-03-16 10:56           ` Gerald Schaefer
2022-03-16 11:06             ` David Hildenbrand
2022-03-16 13:01             ` Christian Borntraeger
2022-03-16 13:27               ` Gerald Schaefer
2022-03-16 14:00                 ` David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 6/7] powerpc/pgtable: remove _PAGE_BIT_SWAP_TYPE for book3s David Hildenbrand
2022-03-15 14:18 ` [PATCH v1 7/7] powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE " David Hildenbrand
2022-03-18 23:48 ` [PATCH v1 0/7] mm: COW fixes part 3: reliable GUP R/W FOLL_GET of anonymous pages Jason Gunthorpe
2022-03-19 11:17   ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220315172102.771bd2cf@thinkpad \
    --to=gerald.schaefer@linux.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=ddutile@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=guro@fb.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@lst.de \
    --cc=hughd@google.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@kernel.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=namit@vmware.com \
    --cc=oded.gabbay@gmail.com \
    --cc=oleg@redhat.com \
    --cc=paulus@samba.org \
    --cc=pedrodemargomes@gmail.com \
    --cc=peterx@redhat.com \
    --cc=riel@surriel.com \
    --cc=rientjes@google.com \
    --cc=rppt@linux.ibm.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=zhangliang5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).