From: "Ananyev, Konstantin"
To: Dharmik Thakkar, Olivier Matz, Andrew Rybchenko
CC: dev@dpdk.org, nd@arm.com, honnappa.nagarahalli@arm.com, ruifeng.wang@arm.com
Date: Fri, 1 Oct 2021 21:30:08 +0000
References: <20210930172735.2675627-1-dharmik.thakkar@arm.com>
In-Reply-To: <20210930172735.2675627-1-dharmik.thakkar@arm.com>
Subject: Re: [dpdk-dev] [RFC] mempool: implement index-based per core cache

> Current mempool per core cache implementation is based on pointer
> For most architectures, each pointer consumes 64b
> Replace it with index-based implementation, where in each buffer
> is addressed by (pool address + index)

I don't think it is going to work: on 64-bit systems the difference between the pool address and its element address could be bigger than 4GB.

> It will reduce memory requirements
>
> L3Fwd performance testing reveals minor improvements in the cache
> performance and no change in throughput
>
> Micro-benchmarking the patch using mempool_perf_test shows
> significant improvement with majority of the test cases
>
> Future plan involves replacing global pool's pointer-based implementation with index-based implementation
>
> Signed-off-by: Dharmik Thakkar
> ---
>  drivers/mempool/ring/rte_mempool_ring.c |  2 +-
>  lib/mempool/rte_mempool.c               |  8 +++
>  lib/mempool/rte_mempool.h               | 74 ++++++++++++++++++++++---
>  3 files changed, 74 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/mempool/ring/rte_mempool_ring.c b/drivers/mempool/ring/rte_mempool_ring.c
> index b1f09ff28f4d..e55913e47f21 100644
> --- a/drivers/mempool/ring/rte_mempool_ring.c
> +++ b/drivers/mempool/ring/rte_mempool_ring.c
> @@ -101,7 +101,7 @@ ring_alloc(struct rte_mempool *mp, uint32_t rg_flags)
>  		return -rte_errno;
>
>  	mp->pool_data = r;
> -
> +	mp->local_cache_base_addr = &r[1];
>  	return 0;
>  }
>
> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index 59a588425bd6..424bdb19c323 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -480,6 +480,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  	int ret;
>  	bool need_iova_contig_obj;
>  	size_t max_alloc_size = SIZE_MAX;
> +	unsigned lcore_id;
>
>  	ret = mempool_ops_alloc_once(mp);
>  	if (ret != 0)
> @@ -600,6 +601,13 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  		}
>  	}
>
> +	/* Init all default caches. */
> +	if (mp->cache_size != 0) {
> +		for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
> +			mp->local_cache[lcore_id].local_cache_base_value =
> +				*(void **)mp->local_cache_base_addr;
> +	}
> +
>  	rte_mempool_trace_populate_default(mp);
>  	return mp->size;
>
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 4235d6f0bf2b..545405c0d3ce 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -51,6 +51,8 @@
>  #include
>  #include
>
> +#include
> +
>  #include "rte_mempool_trace_fp.h"
>
>  #ifdef __cplusplus
> @@ -91,11 +93,12 @@ struct rte_mempool_cache {
>  	uint32_t size; /**< Size of the cache */
>  	uint32_t flushthresh; /**< Threshold before we flush excess elements */
>  	uint32_t len; /**< Current cache count */
> +	void *local_cache_base_value; /**< Base value to calculate indices */
>  	/*
>  	 * Cache is allocated to this size to allow it to overflow in certain
>  	 * cases to avoid needless emptying of cache.
>  	 */
> -	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
> +	uint32_t objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>  } __rte_cache_aligned;
>
>  /**
> @@ -172,7 +175,6 @@ struct rte_mempool_objtlr {
>   * A list of memory where objects are stored
>   */
>  STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
> -
>  /**
>   * Callback used to free a memory chunk
>   */
> @@ -244,6 +246,7 @@ struct rte_mempool {
>  	int32_t ops_index;
>
>  	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
> +	void *local_cache_base_addr; /**< Reference to the base value */
>
>  	uint32_t populated_size; /**< Number of populated objects. */
>  	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
> @@ -1269,7 +1272,15 @@ rte_mempool_cache_flush(struct rte_mempool_cache *cache,
>  	if (cache == NULL || cache->len == 0)
>  		return;
>  	rte_mempool_trace_cache_flush(cache, mp);
> -	rte_mempool_ops_enqueue_bulk(mp, cache->objs, cache->len);
> +
> +	unsigned int i;
> +	unsigned int cache_len = cache->len;
> +	void *obj_table[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> +	void *base_value = cache->local_cache_base_value;
> +	uint32_t *cache_objs = cache->objs;
> +	for (i = 0; i < cache_len; i++)
> +		obj_table[i] = (void *) RTE_PTR_ADD(base_value, cache_objs[i]);
> +	rte_mempool_ops_enqueue_bulk(mp, obj_table, cache->len);
>  	cache->len = 0;
>  }
>
> @@ -1289,7 +1300,9 @@ static __rte_always_inline void
>  __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
>  		      unsigned int n, struct rte_mempool_cache *cache)
>  {
> -	void **cache_objs;
> +	uint32_t *cache_objs;
> +	void *base_value;
> +	uint32_t i;
>
>  	/* increment stat now, adding in mempool always success */
>  	__MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> @@ -1301,6 +1314,12 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
>
>  	cache_objs = &cache->objs[cache->len];
>
> +	base_value = cache->local_cache_base_value;
> +
> +	uint64x2_t v_obj_table;
> +	uint64x2_t v_base_value = vdupq_n_u64((uint64_t)base_value);
> +	uint32x2_t v_cache_objs;
> +
>  	/*
>  	 * The cache follows the following algorithm
>  	 * 1. Add the objects to the cache
> @@ -1309,12 +1328,26 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
>  	 */
>
>  	/* Add elements back into the cache */
> -	rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
> +
> +#if defined __ARM_NEON
> +	for (i = 0; i < (n & ~0x1); i+=2) {
> +		v_obj_table = vld1q_u64((const uint64_t *)&obj_table[i]);
> +		v_cache_objs = vqmovn_u64(vsubq_u64(v_obj_table, v_base_value));
> +		vst1_u32(cache_objs + i, v_cache_objs);
> +	}
> +	if (n & 0x1) {
> +		cache_objs[i] = (uint32_t) RTE_PTR_DIFF(obj_table[i], base_value);
> +	}
> +#else
> +	for (i = 0; i < n; i++) {
> +		cache_objs[i] = (uint32_t) RTE_PTR_DIFF(obj_table[i], base_value);
> +	}
> +#endif
>
>  	cache->len += n;
>
>  	if (cache->len >= cache->flushthresh) {
> -		rte_mempool_ops_enqueue_bulk(mp, &cache->objs[cache->size],
> +		rte_mempool_ops_enqueue_bulk(mp, obj_table + cache->len - cache->size,
>  			cache->len - cache->size);
>  		cache->len = cache->size;
>  	}
> @@ -1415,23 +1448,26 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
>  			      unsigned int n, struct rte_mempool_cache *cache)
>  {
>  	int ret;
> +	uint32_t i;
>  	uint32_t index, len;
> -	void **cache_objs;
> +	uint32_t *cache_objs;
>
>  	/* No cache provided or cannot be satisfied from cache */
>  	if (unlikely(cache == NULL || n >= cache->size))
>  		goto ring_dequeue;
>
> +	void *base_value = cache->local_cache_base_value;
>  	cache_objs = cache->objs;
>
>  	/* Can this be satisfied from the cache? */
>  	if (cache->len < n) {
>  		/* No. Backfill the cache first, and then fill from it */
>  		uint32_t req = n + (cache->size - cache->len);
> +		void *temp_objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>
>  		/* How many do we require i.e. number to fill the cache + the request */
>  		ret = rte_mempool_ops_dequeue_bulk(mp,
> -			&cache->objs[cache->len], req);
> +			temp_objs, req);
>  		if (unlikely(ret < 0)) {
>  			/*
>  			 * In the off chance that we are buffer constrained,
> @@ -1442,12 +1478,32 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
>  			goto ring_dequeue;
>  		}
>
> +		len = cache->len;
> +		for (i = 0; i < req; ++i, ++len) {
> +			cache_objs[len] = (uint32_t) RTE_PTR_DIFF(temp_objs[i], base_value);
> +		}
> +
>  		cache->len += req;
>  	}
>
> +	uint64x2_t v_obj_table;
> +	uint64x2_t v_cache_objs;
> +	uint64x2_t v_base_value = vdupq_n_u64((uint64_t)base_value);
> +
>  	/* Now fill in the response ... */
> +#if defined __ARM_NEON
> +	for (index = 0, len = cache->len - 1; index < (n & ~0x1); index+=2,
> +	     len-=2, obj_table+=2) {
> +		v_cache_objs = vmovl_u32(vld1_u32(cache_objs + len - 1));
> +		v_obj_table = vaddq_u64(v_cache_objs, v_base_value);
> +		vst1q_u64((uint64_t *)obj_table, v_obj_table);
> +	}
> +	if (n & 0x1)
> +		*obj_table = (void *) RTE_PTR_ADD(base_value, cache_objs[len]);
> +#else
> +	for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
> +		*obj_table = (void *) RTE_PTR_ADD(base_value, cache_objs[len]);
> +#endif
>
>  	cache->len -= n;
>
> --
> 2.17.1