All of lore.kernel.org
 help / color / mirror / Atom feed
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "Eads, Gage" <gage.eads@intel.com>,
	Ola Liljedahl <Ola.Liljedahl@arm.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "arybchenko@solarflare.com" <arybchenko@solarflare.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"chaozhu@linux.vnet.ibm.com" <chaozhu@linux.vnet.ibm.com>,
	nd <nd@arm.com>, "Richardson, Bruce" <bruce.richardson@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
	nd <nd@arm.com>
Subject: Re: [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only)
Date: Mon, 4 Feb 2019 18:33:25 +0000	[thread overview]
Message-ID: <AM6PR08MB36720611C7AD7DCD9AC7292B986D0@AM6PR08MB3672.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <9184057F7FC11744A2107296B6B8EB1E541CE1FE@FMSMSX108.amr.corp.intel.com>

> >
> > On Mon, 2019-01-28 at 11:29 -0600, Gage Eads wrote:
> > > This operation can be used for non-blocking algorithms, such as a
> > > non-blocking stack or ring.
> > >
> > > Signed-off-by: Gage Eads <gage.eads@intel.com>
> > > ---
> > >  .../common/include/arch/x86/rte_atomic_64.h        | 31 +++++++++++
> > >  lib/librte_eal/common/include/generic/rte_atomic.h | 65
> > > ++++++++++++++++++++++
> > >  2 files changed, 96 insertions(+)
> > >
> > > diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > > b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > > index fd2ec9c53..b7b90b83e 100644
> > > --- a/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > > +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic_64.h
> > > @@ -34,6 +34,7 @@
> > >  /*
> > >   * Inspired from FreeBSD src/sys/amd64/include/atomic.h
> > >   * Copyright (c) 1998 Doug Rabson
> > > + * Copyright (c) 2019 Intel Corporation
> > >   * All rights reserved.
> > >   */
> > >
> > > @@ -46,6 +47,7 @@
> > >
> > >  #include <stdint.h>
> > >  #include <rte_common.h>
> > > +#include <rte_compat.h>
> > >  #include <rte_atomic.h>
> > >
> > >  /*------------------------- 64 bit atomic operations
> > > ------------------------ -*/ @@ -208,4 +210,33 @@ static inline void
> > > rte_atomic64_clear(rte_atomic64_t *v)
> > >  }
> > >  #endif
> > >
> > > +static inline int __rte_experimental
> > __rte_always_inline?
> >
> > > +rte_atomic128_cmpset(volatile rte_int128_t *dst,
> > No need to declare the location volatile. Volatile doesn't do what you
> > think it does.
> > https://youtu.be/lkgszkPnV8g?t=1027
> >
> 
> I made this volatile to match the existing rte_atomicN_cmpset definitions,
> which presumably have a good reason for using the keyword. Maintainers, any
> input here?
> 
> >
> > > +		     rte_int128_t *exp,
> > I would declare 'exp' const as well and document that 'exp' is not
> > updated (with the old value) for a failure. The reason being that
> > ARMv8.0/AArch64 cannot atomically read the old value without also
> > writing the location and that is bad for performance (unnecessary
> > writes leads to unnecessary contention and worse scalability). And the
> > user must anyway read the location (in the start of the critical
> > section) using e.g. non-atomic 64-bit reads so there isn't actually any
> requirement for an atomic 128-bit read of the location.
> >
> 
> Will change in v2.
> 
IMO, we should not change the definition of this API, because
1) This API will differ from __atomic_compare_exchange_n API. It will be a new API to learn for the users.
2) The definition in this patch will make it easy to replace this API call with __atomic_xxx API (whenever it supports 128b natively on all the platforms)
3) I do not see any documentation in [1] indicating that the 'read on failure' will be an atomic read.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

> > >  rte_int128_t *src,
> > const rte_int128_t *src?
> 
> Sure, I don't see any harm in using const.
> 
> >
> > But why are we not passing 'exp' and 'src' by value? That works great,
> > even with structs. Passing by value simplifies the compiler's life,
> > especially if the call is inlined. Ask a compiler developer.
> 
> I ran objdump on the nb_stack code with both approaches, and pass-by-
> reference resulted in fewer overall x86_64 assembly ops.
> PBV: 100 ops for push, 97 ops for pop
> PBR: 92 ops for push, 84 ops for pop
> 
> (Using the in-progress v5 nb_stack code)
> 
> Another factor -- though much less compelling -- is that with pass-by-reference,
> the user can create a 16B structure and cast it to rte_int128_t when they call
> rte_atomic128_cmpset, whereas with pass-by-value they need to put that
> struct in a union with rte_int128_t.
> 
> >
> > > +		     unsigned int weak,
> > > +		     enum rte_atomic_memmodel_t success,
> > > +		     enum rte_atomic_memmodel_t failure) {
> > > +	RTE_SET_USED(weak);
> > > +	RTE_SET_USED(success);
> > > +	RTE_SET_USED(failure);
> > > +	uint8_t res;
> > > +
> > > +	asm volatile (
> > > +		      MPLOCKED
> > > +		      "cmpxchg16b %[dst];"
> > > +		      " sete %[res]"
> > > +		      : [dst] "=m" (dst->val[0]),
> > > +			"=A" (exp->val[0]),
> > > +			[res] "=r" (res)
> > > +		      : "c" (src->val[1]),
> > > +			"b" (src->val[0]),
> > > +			"m" (dst->val[0]),
> > > +			"d" (exp->val[1]),
> > > +			"a" (exp->val[0])
> > > +		      : "memory");
> > > +
> > > +	return res;
> > > +}
> > > +
> > >  #endif /* _RTE_ATOMIC_X86_64_H_ */
> > > diff --git a/lib/librte_eal/common/include/generic/rte_atomic.h
> > > b/lib/librte_eal/common/include/generic/rte_atomic.h
> > > index b99ba4688..8d612d566 100644
> > > --- a/lib/librte_eal/common/include/generic/rte_atomic.h
> > > +++ b/lib/librte_eal/common/include/generic/rte_atomic.h
> > > @@ -14,6 +14,7 @@
> > >
> > >  #include <stdint.h>
> > >  #include <rte_common.h>
> > > +#include <rte_compat.h>
> > >
> > >  #ifdef __DOXYGEN__
> > >
> > > @@ -1082,4 +1083,68 @@ static inline void
> > > rte_atomic64_clear(rte_atomic64_t
> > > *v)
> > >  }
> > >  #endif
> > >
> > > +/*------------------------ 128 bit atomic operations
> > > +------------------------
> > > -*/
> > > +
> > > +/**
> > > + * 128-bit integer structure.
> > > + */
> > > +typedef struct {
> > > +	uint64_t val[2];
> > > +} __rte_aligned(16) rte_int128_t;
> > So we can't use __int128?
> >
> 
> I'll put it in a union with val[2], in case any implementations want to use it.
> 
> Thanks,
> Gage
> 
> [snip]

  parent reply	other threads:[~2019-02-04 18:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-28 17:29 [PATCH 0/1] Add 128-bit compare and set Gage Eads
2019-01-28 17:29 ` [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only) Gage Eads
2019-01-28 23:01   ` Ola Liljedahl
2019-02-01 17:06     ` Eads, Gage
2019-02-01 19:01       ` Ola Liljedahl
2019-02-01 19:28         ` Eads, Gage
2019-02-01 19:43           ` Ola Liljedahl
2019-02-01 21:05             ` Eads, Gage
2019-02-01 23:11               ` Ola Liljedahl
2019-02-04 18:33       ` Honnappa Nagarahalli [this message]
2019-01-31  5:48   ` Honnappa Nagarahalli
2019-02-01 17:11     ` Eads, Gage
2019-02-22 15:46 ` [PATCH v2 0/1] Add 128-bit compare and set Gage Eads
2019-02-22 15:46   ` [PATCH v2 1/1] eal: add 128-bit cmpxchg (x86-64 only) Gage Eads
2019-03-04 20:19     ` Honnappa Nagarahalli
2019-03-04 20:47       ` Eads, Gage
2019-03-04 20:51   ` [PATCH v3 0/1] Add 128-bit compare and set Gage Eads
2019-03-04 20:51     ` [PATCH v3 1/1] eal: add 128-bit compare exchange (x86-64 only) Gage Eads
2019-03-27 23:12       ` Thomas Monjalon
2019-03-28 16:22         ` Eads, Gage
2019-04-03 17:34     ` [PATCH v4 0/1] Add 128-bit compare and set Gage Eads
2019-04-03 17:34       ` [PATCH v4 1/1] eal: add 128-bit compare exchange (x86-64 only) Gage Eads
2019-04-03 19:04         ` Thomas Monjalon
2019-04-03 19:21           ` Eads, Gage
2019-04-03 19:27             ` Thomas Monjalon
2019-04-03 19:35 ` [PATCH v5] eal/x86: add 128-bit atomic compare exchange Thomas Monjalon
2019-04-03 19:44   ` [PATCH v6] " Gage Eads
2019-04-03 20:01     ` Thomas Monjalon
2019-04-04 11:47     ` Ferruh Yigit
2019-04-04 12:08       ` Thomas Monjalon
2019-04-04 12:12         ` Thomas Monjalon
2019-04-04 12:14           ` Eads, Gage
2019-04-04 12:18             ` Thomas Monjalon
2019-04-04 12:22               ` Eads, Gage
2019-04-04 12:24               ` Eads, Gage
2019-04-04 12:52               ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM6PR08MB36720611C7AD7DCD9AC7292B986D0@AM6PR08MB3672.eurprd08.prod.outlook.com \
    --to=honnappa.nagarahalli@arm.com \
    --cc=Gavin.Hu@arm.com \
    --cc=Ola.Liljedahl@arm.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=chaozhu@linux.vnet.ibm.com \
    --cc=dev@dpdk.org \
    --cc=gage.eads@intel.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=jerinj@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.