From: Christophe Leroy <christophe.leroy@c-s.fr>
To: Leonardo Bras <leonardo@linux.ibm.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Steven Price <steven.price@arm.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>,
	Balbir Singh <bsingharora@gmail.com>,
	Reza Arbab <arbab@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Allison Randal <allison@lohutok.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Michal Suchanek <msuchanek@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH v6 03/11] powerpc/mm: Adds arch-specificic functions to track lockless pgtable walks
Date: Thu, 6 Feb 2020 06:46:18 +0100
Message-ID: <1311ce1c-7e5a-f7c4-2ab2-c03e124ca1c1@c-s.fr>
In-Reply-To: <20200206030900.147032-4-leonardo@linux.ibm.com>



On 06/02/2020 at 04:08, Leonardo Bras wrote:
> On powerpc, we need to do some lockless pagetable walks from functions
> that already have disabled interrupts, especially from real mode with
> MSR[EE=0].
> 
> In these contexts, disabling/enabling interrupts can be very troubling.

When interrupts are already disabled, the flags value saved by
local_irq_save() records that state, so restoring it later leaves
interrupts disabled. So what is the problem?
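
To illustrate the point, here is a minimal userspace model of nested
save/restore. It is not kernel code: the model_* names are hypothetical
and a single boolean stands in for MSR[EE]. It only shows why a nested
save/restore pair leaves an already-disabled state untouched. It says
nothing about whether the real macros are safe to call from real mode.

```c
#include <stdbool.h>

/* Userspace model only: one flag stands in for MSR[EE]. */
static bool model_irqs_enabled = true;

/* Models local_irq_save(): return the previous state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled;

	model_irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back whatever state was saved. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = flags != 0;
}

/*
 * Hypothetical caller that already runs with interrupts disabled.
 * Returns the interrupt state observed right after the nested restore.
 */
static bool model_walk_with_irqs_off(void)
{
	unsigned long outer, inner;
	bool state_after_inner_restore;

	outer = model_irq_save();	/* caller disables interrupts */
	inner = model_irq_save();	/* nested save records "disabled" */
	model_irq_restore(inner);	/* restores "disabled", not "enabled" */
	state_after_inner_restore = model_irqs_enabled;
	model_irq_restore(outer);	/* back to the caller's original state */

	return state_after_inner_restore;
}
```

In this model the nested restore never re-enables interrupts, because
the value it restores was saved while they were already off.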

> 
> So, this arch-specific implementation features functions with an extra
> argument that allows interrupt enable/disable to be skipped:
> __begin_lockless_pgtbl_walk() and __end_lockless_pgtbl_walk().
> 
> Functions similar to the generic ones are also exported, by calling
> the above functions with parameter {en,dis}able_irq = true.
> 
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/book3s/64/pgtable.h |  6 ++
>   arch/powerpc/mm/book3s64/pgtable.c           | 86 +++++++++++++++++++-
>   2 files changed, 91 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 201a69e6a355..78f6ffb1bb3e 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -1375,5 +1375,11 @@ static inline bool pgd_is_leaf(pgd_t pgd)
>   	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
>   }
>   
> +#define __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL
> +unsigned long begin_lockless_pgtbl_walk(void);
> +unsigned long __begin_lockless_pgtbl_walk(bool disable_irq);
> +void end_lockless_pgtbl_walk(unsigned long irq_mask);
> +void __end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq);
> +

Why not make them static inline, just like the generic ones?

>   #endif /* __ASSEMBLY__ */
>   #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> index 2bf7e1b4fd82..535613030363 100644
> --- a/arch/powerpc/mm/book3s64/pgtable.c
> +++ b/arch/powerpc/mm/book3s64/pgtable.c
> @@ -82,6 +82,7 @@ static void do_nothing(void *unused)
>   {
>   
>   }
> +

Is this blank line related to the patch?

>   /*
>    * Serialize against find_current_mm_pte which does lock-less
>    * lookup in page tables with local interrupts disabled. For huge pages
> @@ -98,6 +99,89 @@ void serialize_against_pte_lookup(struct mm_struct *mm)
>   	smp_call_function_many(mm_cpumask(mm), do_nothing, NULL, 1);
>   }
>   
> +/* begin_lockless_pgtbl_walk: Must be inserted before a function call that does
> + *   lockless pagetable walks, such as __find_linux_pte().
> + * This version allows setting disable_irq=false, so irqs are not touched, which
> + *   is quite useful for running when ints are already disabled (like real-mode)
> + */
> +inline
> +unsigned long __begin_lockless_pgtbl_walk(bool disable_irq)
> +{
> +	unsigned long irq_mask = 0;
> +
> +	/*
> +	 * Interrupts must be disabled during the lockless page table walk.
> +	 * That's because the deleting or splitting involves flushing TLBs,
> +	 * which in turn issues interrupts, that will block when disabled.
> +	 *
> +	 * When this function is called from realmode with MSR[EE=0],
> +	 * it's not needed to touch irq, since it's already disabled.
> +	 */
> +	if (disable_irq)
> +		local_irq_save(irq_mask);
> +
> +	/*
> +	 * This memory barrier pairs with any code that is either trying to
> +	 * delete page tables, or split huge pages. Without this barrier,
> +	 * the page tables could be read speculatively outside of interrupt
> +	 * disabling or reference counting.
> +	 */
> +	smp_mb();
> +
> +	return irq_mask;
> +}
> +EXPORT_SYMBOL(__begin_lockless_pgtbl_walk);
> +
> +/* begin_lockless_pgtbl_walk: Must be inserted before a function call that does
> + *   lockless pagetable walks, such as __find_linux_pte().
> + * This version is used by generic code, and always assume irqs will be disabled
> + */
> +unsigned long begin_lockless_pgtbl_walk(void)
> +{
> +	return __begin_lockless_pgtbl_walk(true);
> +}
> +EXPORT_SYMBOL(begin_lockless_pgtbl_walk);

Even more than __begin_lockless_pgtbl_walk(), this one is worth making
static inline in the header file.
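
A rough sketch of what that could look like, assuming all four helpers
move into the header. The function names come from the patch. The
sketch_irqs_enabled flag and the three macro stand-ins are assumptions
that exist only so the fragment compiles outside the kernel. Static
inline functions in a header need no EXPORT_SYMBOL().

```c
#include <stdbool.h>

/* Stand-ins (assumptions) replacing the kernel primitives for this sketch. */
static bool sketch_irqs_enabled = true;
#define local_irq_save(flags) \
	do { (flags) = sketch_irqs_enabled; sketch_irqs_enabled = false; } while (0)
#define local_irq_restore(flags) \
	do { sketch_irqs_enabled = (flags) != 0; } while (0)
#define smp_mb()	__sync_synchronize()	/* GCC/Clang full barrier builtin */

static inline unsigned long __begin_lockless_pgtbl_walk(bool disable_irq)
{
	unsigned long irq_mask = 0;

	/* Skip the irq save when the caller already runs with irqs off. */
	if (disable_irq)
		local_irq_save(irq_mask);

	smp_mb();	/* pairs with page table teardown / huge page split */

	return irq_mask;
}

/* Variant for generic code: always saves and disables interrupts. */
static inline unsigned long begin_lockless_pgtbl_walk(void)
{
	return __begin_lockless_pgtbl_walk(true);
}

static inline void __end_lockless_pgtbl_walk(unsigned long irq_mask,
					     bool enable_irq)
{
	smp_mb();	/* keep page table reads inside the protected window */

	if (enable_irq)
		local_irq_restore(irq_mask);
}

/* Variant for generic code: always restores the saved interrupt state. */
static inline void end_lockless_pgtbl_walk(unsigned long irq_mask)
{
	__end_lockless_pgtbl_walk(irq_mask, true);
}
```

With constant true/false arguments the compiler can drop the untaken
branch at each call site, which is the main benefit over out-of-line
exported functions.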

> +
> +/*
> + * __end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer
> + *   returned by a lockless pagetable walk, such as __find_linux_pte()
> + * This version allows setting enable_irq=false, so irqs are not touched, which
> + *   is quite useful for running when ints are already disabled (like real-mode)
> + */
> +inline void __end_lockless_pgtbl_walk(unsigned long irq_mask, bool enable_irq)
> +{
> +	/*
> +	 * This memory barrier pairs with any code that is either trying to
> +	 * delete page tables, or split huge pages. Without this barrier,
> +	 * the page tables could be read speculatively outside of interrupt
> +	 * disabling or reference counting.
> +	 */
> +	smp_mb();
> +
> +	/*
> +	 * Interrupts must be disabled during the lockless page table walk.
> +	 * That's because the deleting or splitting involves flushing TLBs,
> +	 * which in turn issues interrupts, that will block when disabled.
> +	 *
> +	 * When this function is called from realmode with MSR[EE=0],
> +	 * it's not needed to touch irq, since it's already disabled.
> +	 */
> +	if (enable_irq)
> +		local_irq_restore(irq_mask);
> +}
> +EXPORT_SYMBOL(__end_lockless_pgtbl_walk);
> +
> +/*
> + * end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer
> + *   returned by a lockless pagetable walk, such as __find_linux_pte()
> + * This version is used by generic code, and always assume irqs will be enabled
> + */
> +void end_lockless_pgtbl_walk(unsigned long irq_mask)
> +{
> +	__end_lockless_pgtbl_walk(irq_mask, true);
> +}
> +EXPORT_SYMBOL(end_lockless_pgtbl_walk);
> +
>   /*
>    * We use this to invalidate a pmdp entry before switching from a
>    * hugepte to regular pmd entry.
> @@ -487,7 +571,7 @@ static int __init setup_disable_tlbie(char *str)
>   	tlbie_capable = false;
>   	tlbie_enabled = false;
>   
> -        return 1;
> +	return 1;

Is that related to this patch at all?

>   }
>   __setup("disable_tlbie", setup_disable_tlbie);
>   
> 

Christophe
