From: Matthew Wilcox <willy@infradead.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: kan.liang@linux.intel.com, mingo@kernel.org, acme@kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, eranian@google.com,
	christophe.leroy@csgroup.eu, npiggin@gmail.com,
	linuxppc-dev@lists.ozlabs.org, mpe@ellerman.id.au,
	will@kernel.org, aneesh.kumar@linux.ibm.com,
	sparclinux@vger.kernel.org, davem@davemloft.net,
	catalin.marinas@arm.com, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, ak@linux.intel.com,
	dave.hansen@intel.com, kirill.shutemov@linux.intel.com
Subject: Re: [PATCH v2 1/6] mm/gup: Provide gup_get_pte() more generic
Date: Thu, 26 Nov 2020 12:43:00 +0000
Message-ID: <20201126124300.GP4327@casper.infradead.org>
In-Reply-To: <20201126121121.036370527@infradead.org>

On Thu, Nov 26, 2020 at 01:01:15PM +0100, Peter Zijlstra wrote:
> +#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
> +/*
> + * WARNING: only to be used in the get_user_pages_fast() implementation.
> + * With get_user_pages_fast(), we walk down the pagetables without taking any
> + * locks.  For this we would like to load the pointers atomically, but sometimes
> + * that is not possible (e.g. without expensive cmpxchg8b on x86_32 PAE).  What
> + * we do have is the guarantee that a PTE will only either go from not present
> + * to present, or present to not present or both -- it will not switch to a
> + * completely different present page without a TLB flush in between; something
> + * that we are blocking by holding interrupts off.

I feel like this comment needs some love.  How about:

 * For walking the pagetables without holding any locks.  Some architectures
 * (e.g. x86-32 PAE) cannot load the entries atomically without using
 * expensive instructions.  We are guaranteed that a PTE will only either go
 * from not present to present, or present to not present -- it will not
 * switch to a completely different present page without a TLB flush
 * in between, which we are blocking by holding interrupts off.

And it would be nice to have an assertion that interrupts are disabled
in the code.  Because comments are nice, but nobody reads them.
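
To make the suggestion concrete, here is a minimal userspace sketch of a
reader that checks its precondition before touching the entry.  Everything
prefixed with model_ is hypothetical and exists only for illustration; in
the kernel a check such as lockdep_assert_irqs_disabled() would be one
candidate, but that choice is an assumption and not something this thread
settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU's interrupt state; not a kernel API. */
static bool model_irqs_off;

static void model_irq_disable(void) { model_irqs_off = true; }
static void model_irq_enable(void)  { model_irqs_off = false; }

/* Two-word PTE as on x86-32 PAE; field names follow the quoted patch. */
typedef struct {
	uint32_t pte_low;
	uint32_t pte_high;
} pte_t;

/*
 * Sketch of the reader with the precondition checked up front.
 * The smp_rmb() calls from the patch are omitted because this
 * single-threaded model has no concurrent writer to order against.
 */
static pte_t model_ptep_get_lockless(const volatile pte_t *ptep)
{
	pte_t pte;

	assert(model_irqs_off);	/* caller must have interrupts disabled */

	do {
		pte.pte_low = ptep->pte_low;
		pte.pte_high = ptep->pte_high;
	} while (pte.pte_low != ptep->pte_low);

	return pte;
}
```

A caller would bracket the read with model_irq_disable() and
model_irq_enable(); calling the reader outside that window trips the
assertion instead of silently relying on the comment.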

> +static inline pte_t ptep_get_lockless(pte_t *ptep)
> +{
> +	pte_t pte;
> +
> +	do {
> +		pte.pte_low = ptep->pte_low;
> +		smp_rmb();
> +		pte.pte_high = ptep->pte_high;
> +		smp_rmb();
> +	} while (unlikely(pte.pte_low != ptep->pte_low));
> +
> +	return pte;
> +}
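
As a rough illustration of why the loop re-reads pte_low, the sketch below
models a writer that replaces the entry between the two loads.  It is a
single-threaded model, not real concurrency: the hook parameter, the
split_pte type and replace_once() are all hypothetical names made up for
this example, and the memory barriers from the patch are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct split_pte {
	uint32_t low;	/* the half that is re-checked */
	uint32_t high;
};

/* Hypothetical hook: lets a test mutate the entry between the two loads. */
typedef void (*race_hook_t)(volatile struct split_pte *ptep, int attempt);

static struct split_pte read_split_pte(volatile struct split_pte *ptep,
				       race_hook_t hook, int *attempts)
{
	struct split_pte val;
	int n = 0;

	assert(ptep != NULL);

	do {
		val.low = ptep->low;
		if (hook)
			hook(ptep, n);	/* simulated concurrent writer */
		val.high = ptep->high;
		n++;
	} while (val.low != ptep->low);	/* low changed: halves may not match */

	if (attempts)
		*attempts = n;
	return val;
}

/* Simulated writer: replaces the entry once, during the first attempt. */
static void replace_once(volatile struct split_pte *ptep, int attempt)
{
	if (attempt == 0) {
		ptep->low = 0x2067u;
		ptep->high = 0x5u;
	}
}
```

With replace_once() as the hook, the first pass pairs the old low word with
the new high word, the re-check notices that low has changed, and the second
pass returns a consistent pair.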

Thread overview: 31+ messages
2020-11-26 12:01 [PATCH v2 0/6] perf/mm: Fix PERF_SAMPLE_*_PAGE_SIZE Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 1/6] mm/gup: Provide gup_get_pte() more generic Peter Zijlstra
2020-11-26 12:43   ` Matthew Wilcox [this message]
2020-11-26 13:02     ` Peter Zijlstra
2020-12-03  9:07   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24   ` tip-bot2 for Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 2/6] mm: Introduce pXX_leaf_size() Peter Zijlstra
2020-11-26 12:43   ` Matthew Wilcox
2020-12-03  9:07   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24   ` tip-bot2 for Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 3/6] perf/core: Fix arch_perf_get_page_size() Peter Zijlstra
2020-11-26 12:34   ` Matthew Wilcox
2020-11-26 12:42     ` Peter Zijlstra
2020-11-26 12:56       ` Matthew Wilcox
2020-11-26 13:06         ` Peter Zijlstra
2020-11-26 13:27           ` Matthew Wilcox
2020-12-03  9:07       ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24       ` tip-bot2 for Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 4/6] arm64/mm: Implement pXX_leaf_size() support Peter Zijlstra
2020-11-26 12:57   ` Peter Zijlstra
2020-11-26 14:32     ` Will Deacon
2020-12-03  9:07     ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24     ` tip-bot2 for Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 5/6] sparc64/mm: " Peter Zijlstra
2020-12-03  9:07   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24   ` tip-bot2 for Peter Zijlstra
2020-12-09 18:44   ` tip-bot2 for Peter Zijlstra
2020-11-26 12:01 ` [PATCH v2 6/6] powerpc/8xx: " Peter Zijlstra
2020-12-03  9:07   ` [tip: perf/core] " tip-bot2 for Peter Zijlstra
2020-12-03  9:24   ` tip-bot2 for Peter Zijlstra
2020-12-09 18:44   ` tip-bot2 for Peter Zijlstra
