All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs
Date: Wed, 2 Dec 2015 16:18:13 +0530	[thread overview]
Message-ID: <20151202104811.GA6337@localhost.localdomain> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB97725836ACFADA@irsmsx105.ger.corp.intel.com>

On Wed, Dec 02, 2015 at 10:33:44AM +0000, Ananyev, Konstantin wrote:
> Hi everyone,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jianbo Liu
> > Sent: Wednesday, December 02, 2015 9:50 AM
> > To: Jerin Jacob
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs
> > 
> > On 2 December 2015 at 16:03, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> > > On Wed, Dec 02, 2015 at 02:54:52PM +0800, Jianbo Liu wrote:
> > >> On 2 December 2015 at 00:41, Jerin Jacob <jerin.jacob@caviumnetworks.com> wrote:
> > >> > On Tue, Dec 01, 2015 at 01:41:15PM -0500, Jianbo Liu wrote:
> > >> >> Adds ARM NEON support for lpm.
> > >> >> And enables table/pipeline libraries which depend on lpm.
> > >> >
> > >> > I already sent the patch on the same yesterday.
> > >> > We can converge the patches after the discussion.
> > >> > Please check "[dpdk-dev] [PATCH 0/3] add lpm support for NEON" on ml
> > >> >
> > >> Yes, I have read your patch. But there are many differences, so I sent
> > >> mine for your reviewing :)
> > >>
> > >> >
> > >> >>
> > >> >> Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
> > >> >> ---
> > >> >>  config/defconfig_arm-armv7a-linuxapp-gcc          |  3 -
> > >> >>  config/defconfig_arm64-armv8a-linuxapp-gcc        |  3 -
> > >> >>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 28 ++++++++++
> > >> >>  lib/librte_lpm/rte_lpm.h                          | 68 ++++++++++++++++-------
> > >> >>  4 files changed, 77 insertions(+), 25 deletions(-)
> > >> >>
> > >> >> diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> index cbebd64..efffa1f 100644
> > >> >> --- a/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> +++ b/config/defconfig_arm-armv7a-linuxapp-gcc
> > >> >> @@ -53,9 +53,6 @@ CONFIG_RTE_LIBRTE_KNI=n
> > >> >>  CONFIG_RTE_EAL_IGB_UIO=n
> > >> >>
> > >> >>  # fails to compile on ARM
> > >> >> -CONFIG_RTE_LIBRTE_LPM=n
> > >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> > >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> > >> >>  CONFIG_RTE_SCHED_VECTOR=n
> > >> >>
> > >> >>  # cannot use those on ARM
> > >> >> diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> index 504f3ed..57f7941 100644
> > >> >> --- a/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
> > >> >> @@ -51,7 +51,4 @@ CONFIG_RTE_LIBRTE_IVSHMEM=n
> > >> >>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
> > >> >>  CONFIG_RTE_LIBRTE_I40E_PMD=n
> > >> >>
> > >> >> -CONFIG_RTE_LIBRTE_LPM=n
> > >> >> -CONFIG_RTE_LIBRTE_TABLE=n
> > >> >> -CONFIG_RTE_LIBRTE_PIPELINE=n
> > >> >>  CONFIG_RTE_SCHED_VECTOR=n
> > >> >> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> index a33c054..7437711 100644
> > >> >> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> > >> >> @@ -41,6 +41,8 @@ extern "C" {
> > >> >>
> > >> >>  typedef int32x4_t xmm_t;
> > >> >>
> > >> >> +typedef int32x4_t __m128i;
> > >> >> +
> > >> >>  #define      XMM_SIZE        (sizeof(xmm_t))
> > >> >>  #define      XMM_MASK        (XMM_SIZE - 1)
> > >> >>
> > >> >> @@ -53,6 +55,32 @@ typedef union rte_xmm {
> > >> >>       double   pd[XMM_SIZE / sizeof(double)];
> > >> >>  } __attribute__((aligned(16))) rte_xmm_t;
> > >> >>
> > >> >> +static __inline __m128i
> > >> >> +_mm_set_epi32(int i3, int i2, int i1, int i0)
> > >> >> +{
> > >> >> +     int32_t r[4] = {i0, i1, i2, i3};
> > >> >> +
> > >> >> +     return vld1q_s32(r);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_loadu_si128(__m128i *p)
> > >> >> +{
> > >> >> +     return vld1q_s32((int32_t *)p);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_set1_epi32(int i)
> > >> >> +{
> > >> >> +     return vdupq_n_s32(i);
> > >> >> +}
> > >> >> +
> > >> >> +static __inline __m128i
> > >> >> +_mm_and_si128(__m128i a, __m128i b)
> > >> >> +{
> > >> >> +     return vandq_s32(a, b);
> > >> >> +}
> > >> >> +
> > >
> > > IMO, it's not always good to emulate GCC defined intrinsics of
> > > other architecture. What if a legacy DPDK application has such mappings
> > > then BOOM, multiple definition, which one is correct? which one
> > > to comment it out? Integration pain starts for DPDK library consumer:-(
> > >
> > They can include rte_vect.h in build/include directly, which is linked correctly
> > to the one for that ARCH, so there is no need to worry about.
> > 
> > 
> > >> >
> > >> > IMO, it makes sense to not emulate the SSE intrinsics with NEON
> > >> > Let's create the rte_vect_* as required. look at the existing patch.
> > >> >
> > >> I thought of creating a layer of SIMD over all the platforms before.
> > >> But can't you see it make things complicated, considering there are
> > >> only few simple intrinsic to implement?
> > >
> > > Not true, There were, a lot of SSE intrinsics needs be to emulated for ACL NEON
> > > implementation if I were to take this approach and emulation comes with
> > > the cost.
> > >
> > No, I will not re-implement all the intrinsic like that .
> > I only do with the simple intrinsic, such as load/store, as you said below.
> > 
> > > So my take is,
> > > lets the each architecture implementation for specific SIMD version of DPDK
> > > API in the library should have the freedom to implement the API in
> > > NATIVE.
> > >
> > > And let's create only rte_vect_* abstraction only for using
> > > that API/library. Which boils down to have very minimal rte_vect_*
> > > abstraction to load, store, set not beyond that.
> > >
> > > This makes clear "contract" between DPDK library and the applications.
> > > and make easy for remaning new architecture  porting effort in DPDK.
> > >
> > Agree.
> > But I reuse existing intrinsic names, and you recreate new ones.
> > And I try to do as few changes as possible, and try to avoid any
> > mistaken which may cause code un-compiled.
> > I think it's design level question, we need to hear what others talk about it.
> > 
> > > Imagine how your proposed function will look like if new architecture
> > > wants to implement "optimized" version of rte_lpm_lookupx4
> > >
> > There is no optimization for this (simple) rte_lpm_lookupx4, otherwise
> > you have done that in your patch.
> > If there is for other new platform, defintely they should do like
> > yours, as you did for NEON ACL.
> > 
> > >
> > >> If do so, we also need to explain to others how to use these interfaces.
> > >> Besides, this patch did the smallest changes to the original code, and
> > >> more likely to be accepted by others.
> > >
> > > other patch makes no changes to IA version of rte_lpm_lookupx4.I thought
> > > that make reviewer easy to review the changes in architecture
> > > perspective.
> > >
> > As I know, they don't enable LPM for PPC, and ARM is the first one to
> > touch this issue.
> > 
> > >>
> > >> >
> > >> >>  #ifdef RTE_ARCH_ARM
> > >> >>  /* NEON intrinsic vqtbl1q_u8() is not supported in ARMv7-A(AArch32) */
> > >> >>  static __inline uint8x16_t
> > >> >> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> > >> >> index c299ce2..c76c07d 100644
> > >> >> --- a/lib/librte_lpm/rte_lpm.h
> > >> >> +++ b/lib/librte_lpm/rte_lpm.h
> > >> >> @@ -361,6 +361,47 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
> > >> >>  /* Mask four results. */
> > >> >>  #define       RTE_LPM_MASKX4_RES     UINT64_C(0x00ff00ff00ff00ff)
> > >> >>
> > >> >> +#if defined(RTE_ARCH_ARM) || defined(RTE_ARCH_ARM64)
> > >> >
> > >> > Separate out arm implementation to the different header file.
> > >> > Too many ifdef looks odd in the header file and difficult to manage.
> > >> >
> > >> But there are many ifdefs already.
> > >> And It seems unreasonable to add a new file only for one small function.
> > >>
> > >
> > > small or big, its matter of each architecture to have
> > > the freedom for the optimized version for the implementation.
> > >
> > > What if  other architecture demands to write this function in assembly
> > > or restructure it for performance improvement?
> > >
> > If there is such demands, should do like that.
> > But I don't see any restructure in your patch, and you still follow
> > the logic as x86, is it worth adding a new file?
> > 
> 
> My preference would also be to put architecture dependent implementation
> into different files. 
> Might be create lib/librte_lpm/arch/(arm|x86)/... here?
> Konstantin  

+1

my existing patch creates lib/librte_lpm/rte_lpm_neon.h instead
of lib/librte_lpm/arch/arm/rte_lpm_neon.h like
lib/librte_hash/rte_cmp_x86.h

I am OK for changing the directory structure as proposed in my next revision
of patch.
Let me know if anyone has any objections/concerns.

Jerin

  reply	other threads:[~2015-12-02 10:48 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-01 18:41 [PATCH 0/4] support acl/lpm/table/pipeline libs for armv7 and armv8 Jianbo Liu
2015-12-01 12:47 ` Jan Viktorin
2015-12-01 20:56   ` Jianbo Liu
2015-12-01 18:41 ` [PATCH 1/4] eal/arm: use RTE_ARM_EAL_RDTSC_USE_PMU in rte_cycle_32.h Jianbo Liu
2015-12-01 12:41   ` Jan Viktorin
2015-12-01 12:43   ` Jan Viktorin
2015-12-01 18:41 ` [PATCH 2/4] eal/acl: enable acl for armv7-a Jianbo Liu
2015-12-01 14:43   ` Jerin Jacob
2015-12-01 14:46     ` Jan Viktorin
2015-12-02  6:14       ` Jianbo Liu
2015-12-01 18:41 ` [PATCH 3/4] eal/arm: Enable lpm/table/pipeline libs Jianbo Liu
2015-12-01 16:41   ` Jerin Jacob
2015-12-01 17:02     ` Jan Viktorin
2015-12-02  7:02     ` Jianbo Liu
     [not found]     ` <CAP4Qi3-5ofDU-2-4KsxFzMC1OpTsc5WjmxcFT2Eu_URA0UBzDw@mail.gmail.com>
2015-12-02  8:03       ` Jerin Jacob
2015-12-02  9:49         ` Jianbo Liu
2015-12-02 10:33           ` Ananyev, Konstantin
2015-12-02 10:48             ` Jerin Jacob [this message]
2015-12-02 13:06               ` Jan Viktorin
2015-12-02 10:39           ` Jerin Jacob
2015-12-02 13:05             ` Jan Viktorin
2015-12-02 13:13             ` Jianbo Liu
2015-12-02 14:34               ` Jerin Jacob
2015-12-02 16:40                 ` Thomas Monjalon
2015-12-02 16:53                   ` Jerin Jacob
2015-12-02 16:57                     ` Thomas Monjalon
2015-12-02 17:38                       ` Jerin Jacob
2015-12-03  9:33                       ` Jerin Jacob
2015-12-03 11:02                         ` Ananyev, Konstantin
2015-12-03 12:17                           ` Jerin Jacob
2015-12-03 12:42                             ` Ananyev, Konstantin
2015-12-03 13:20                               ` Jerin Jacob
2015-12-01 18:41 ` [PATCH 4/4] maintainers: claim resposibility for ARMv7 and ARMv8 Jianbo Liu
2015-12-01 16:44   ` Jerin Jacob
2015-12-03 15:02 ` [PATCH v2 0/3] support acl lib for armv7-a and a small fix Jianbo Liu
2015-12-03 15:02   ` [PATCH v2 1/3] eal/arm: use RTE_ARM_EAL_RDTSC_USE_PMU in rte_cycle_32.h Jianbo Liu
2015-12-08  1:13     ` Thomas Monjalon
2015-12-03 15:02   ` [PATCH v2 2/3] eal/acl: enable acl for armv7-a Jianbo Liu
2015-12-03 15:13     ` Jerin Jacob
2015-12-08  1:18     ` Thomas Monjalon
2015-12-08  1:50       ` Jianbo Liu
2015-12-08  2:23         ` Thomas Monjalon
2015-12-08  7:56           ` Jianbo Liu
2015-12-08 10:03             ` Thomas Monjalon
2015-12-08 10:21               ` Jianbo Liu
2015-12-08 10:38                 ` Thomas Monjalon
2015-12-08 11:27                   ` Jan Viktorin
2015-12-08 10:25               ` Jan Viktorin
2015-12-03 15:02   ` [PATCH v2 3/3] maintainers: claim resposibility for ARMv7 and ARMv8 Jianbo Liu
2015-12-08  1:24   ` [PATCH v2 0/3] support acl lib for armv7-a and a small fix Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151202104811.GA6337@localhost.localdomain \
    --to=jerin.jacob@caviumnetworks.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.