All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: benh@kernel.crashing.org, mpe@ellerman.id.au, anton@samba.org,
	paulus@samba.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 13/13] powerpc: rewrite local_t using soft_irq
Date: Fri, 16 Sep 2016 21:01:20 +1000	[thread overview]
Message-ID: <20160916210120.395a52be@roar.ozlabs.ibm.com> (raw)
In-Reply-To: <1473944523-624-14-git-send-email-maddy@linux.vnet.ibm.com>

On Thu, 15 Sep 2016 18:32:03 +0530
Madhavan Srinivasan <maddy@linux.vnet.ibm.com> wrote:

> Local atomic operations are fast and highly reentrant per CPU counters.
> Used for percpu variable updates. Local atomic operations only guarantee
> variable modification atomicity wrt the CPU which owns the data and
> these needs to be executed in a preemption safe way.
> 
> Here is the design of this patch. Since local_* operations
> are only need to be atomic to interrupts (IIUC), we have two options.
> Either replay the "op" if interrupted or replay the interrupt after
> the "op". Initial patchset posted was based on implementing local_* operation
> based on CR5 which replay's the "op". Patchset had issues in case of
> rewinding the address pointor from an array. This make the slow patch
> really slow. Since CR5 based implementation proposed using __ex_table to find
> the rewind addressr, this rasied concerns about size of __ex_table and vmlinux.
> 
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-December/123115.html
> 
> But this patch uses, local_irq_pmu_save to soft_disable
> interrupts (including PMIs). After finishing the "op", local_irq_pmu_restore()
> called and correspondingly interrupts are replayed if any occured.
> 
> patch re-write the current local_* functions to use arch_local_irq_disbale.
> Base flow for each function is
> 
> {
> 	local_irq_pmu_save(flags)
> 	load
> 	..
> 	store
> 	local_irq_pmu_restore(flags)
> }
> 
> Reason for the approach is that, currently l[w/d]arx/st[w/d]cx.
> instruction pair is used for local_* operations, which are heavy
> on cycle count and they dont support a local variant. So to
> see whether the new implementation helps, used a modified
> version of Rusty's benchmark code on local_t.
> 
> https://lkml.org/lkml/2008/12/16/450
> 
> Modifications to Rusty's benchmark code:
> - Executed only local_t test
> 
> Here are the values with the patch.
> 
> Time in ns per iteration
> 
> Local_t             Without Patch           With Patch
> 
> _inc                        28              8
> _add                        28              8
> _read                       3               3
> _add_return         28              7
> 
> Currently only asm/local.h has been rewrite, and also
> the entire change is tested only in PPC64 (pseries guest)
> and PPC64 host (LE)
> 
> TODO:
> 	- local_cmpxchg and local_xchg needs modification.
> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/local.h | 94 ++++++++++++++++++++++++++++------------
>  1 file changed, 66 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
> index b8da91363864..fb5728abb4e9 100644
> --- a/arch/powerpc/include/asm/local.h
> +++ b/arch/powerpc/include/asm/local.h
> @@ -3,6 +3,9 @@
>  
>  #include <linux/percpu.h>
>  #include <linux/atomic.h>
> +#include <linux/irqflags.h>
> +
> +#include <asm/hw_irq.h>
>  
>  typedef struct
>  {
> @@ -14,24 +17,50 @@ typedef struct
>  #define local_read(l)	atomic_long_read(&(l)->a)
>  #define local_set(l,i)	atomic_long_set(&(l)->a, (i))
>  
> -#define local_add(i,l)	atomic_long_add((i),(&(l)->a))
> -#define local_sub(i,l)	atomic_long_sub((i),(&(l)->a))
> -#define local_inc(l)	atomic_long_inc(&(l)->a)
> -#define local_dec(l)	atomic_long_dec(&(l)->a)
> +static __inline__ void local_add(long i, local_t *l)
> +{
> +	long t;
> +	unsigned long flags;
> +
> +	local_irq_pmu_save(flags);
> +	__asm__ __volatile__(
> +	PPC_LL" %0,0(%2)\n\
> +	add     %0,%1,%0\n"
> +	PPC_STL" %0,0(%2)\n"
> +	: "=&r" (t)
> +	: "r" (i), "r" (&(l->a.counter)));
> +	local_irq_pmu_restore(flags);
> +}
> +
> +static __inline__ void local_sub(long i, local_t *l)
> +{
> +	long t;
> +	unsigned long flags;
> +
> +	local_irq_pmu_save(flags);
> +	__asm__ __volatile__(
> +	PPC_LL" %0,0(%2)\n\
> +	subf    %0,%1,%0\n"
> +	PPC_STL" %0,0(%2)\n"
> +	: "=&r" (t)
> +	: "r" (i), "r" (&(l->a.counter)));
> +	local_irq_pmu_restore(flags);
> +}
>  
>  static __inline__ long local_add_return(long a, local_t *l)
>  {
>  	long t;
> +	unsigned long flags;
>  
> +	local_irq_pmu_save(flags);
>  	__asm__ __volatile__(
> -"1:"	PPC_LLARX(%0,0,%2,0) "			# local_add_return\n\
> +	PPC_LL" %0,0(%2)\n\
>  	add	%0,%1,%0\n"
> -	PPC405_ERR77(0,%2)
> -	PPC_STLCX	"%0,0,%2 \n\
> -	bne-	1b"
> +	PPC_STL	"%0,0(%2)\n"
>  	: "=&r" (t)
>  	: "r" (a), "r" (&(l->a.counter))
>  	: "cc", "memory");
> +	local_irq_pmu_restore(flags);

Are all your clobbers correct? You might not be clobbering "cc" here
anymore, for example. Could you double check those? Otherwise, awesome
patch!

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

  parent reply	other threads:[~2016-09-16 11:01 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-15 13:01 [PATCH 00/13] powerpc: "paca->soft_enabled" based local atomic operation implementation Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 01/13] powerpc: Add #defs for paca->soft_enabled flags Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 02/13] powerpc: Cleanup to use IRQ_DISABLE_MASK_* macros for paca->soft_enabled update Madhavan Srinivasan
2016-09-16  9:47   ` Nicholas Piggin
2016-09-19  4:04     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 03/13] powerpc: move set_soft_enabled() and rename Madhavan Srinivasan
2016-09-16  9:50   ` Nicholas Piggin
2016-09-19  4:05     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 04/13] powerpc: Use soft_enabled_set api to update paca->soft_enabled Madhavan Srinivasan
2016-09-16  9:53   ` Nicholas Piggin
2016-09-16 11:43     ` David Laight
2016-09-16 11:59       ` Nicholas Piggin
2016-09-16 13:22         ` David Laight
2016-09-19  2:52           ` Nicholas Piggin
2016-09-19  5:32             ` Madhavan Srinivasan
2016-09-19  5:05       ` Madhavan Srinivasan
2016-09-19  4:11     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 05/13] powerpc: Add soft_enabled manipulation functions Madhavan Srinivasan
2016-09-16  9:57   ` Nicholas Piggin
2016-09-19  5:41     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 06/13] powerpc: reverse the soft_enable logic Madhavan Srinivasan
2016-09-16 10:05   ` Nicholas Piggin
2016-09-19  5:45     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 07/13] powerpc: Avoid using EXCEPTION_PROLOG_1 macro in MASKABLE_* Madhavan Srinivasan
2016-09-16 10:08   ` Nicholas Piggin
2016-09-15 13:01 ` [PATCH 08/13] powerpc: Add new _EXCEPTION_PROLOG_1 macro Madhavan Srinivasan
2016-09-16 10:12   ` Nicholas Piggin
2016-09-19  5:54     ` Madhavan Srinivasan
2016-09-15 13:01 ` [PATCH 09/13] powerpc: Introduce new mask bit for soft_enabled Madhavan Srinivasan
2016-09-16 10:16   ` Nicholas Piggin
2016-09-19  5:57     ` Madhavan Srinivasan
2016-09-15 13:02 ` [PATCH 10/13] powerpc: Add "bitmask" paramater to MASKABLE_* macros Madhavan Srinivasan
2016-09-16 11:03   ` Nicholas Piggin
2016-09-19  5:58     ` Madhavan Srinivasan
2016-09-15 13:02 ` [PATCH 11/13] powerpc: Add support to mask perf interrupts and replay them Madhavan Srinivasan
2016-09-16 10:47   ` Nicholas Piggin
2016-09-15 13:02 ` [PATCH 12/13] powerpc: Add a Kconfig and a functions to set new soft_enabled mask Madhavan Srinivasan
2016-09-16 10:34   ` Nicholas Piggin
2016-09-16 10:56   ` Nicholas Piggin
2016-09-19  6:03     ` Madhavan Srinivasan
2016-09-15 13:02 ` [PATCH 13/13] powerpc: rewrite local_t using soft_irq Madhavan Srinivasan
2016-09-15 16:55   ` kbuild test robot
2016-09-16 11:01   ` Nicholas Piggin [this message]
2016-09-16 11:08 ` [PATCH 00/13] powerpc: "paca->soft_enabled" based local atomic operation implementation Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160916210120.395a52be@roar.ozlabs.ibm.com \
    --to=npiggin@gmail.com \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=maddy@linux.vnet.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.