From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, URIBL_SBL,URIBL_SBL_A autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 173DAC28CC5 for ; Wed, 5 Jun 2019 18:21:22 +0000 (UTC) Received: from dpdk.org (dpdk.org [92.243.14.124]) by mail.kernel.org (Postfix) with ESMTP id 8290B2075C for ; Wed, 5 Jun 2019 18:21:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8290B2075C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=dev-bounces@dpdk.org Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DDDF61B95B; Wed, 5 Jun 2019 20:21:19 +0200 (CEST) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 6978B2C6A for ; Wed, 5 Jun 2019 20:21:17 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Jun 2019 11:21:16 -0700 X-ExtLoop1: 1 Received: from fmsmsx107.amr.corp.intel.com ([10.18.124.205]) by fmsmga007.fm.intel.com with ESMTP; 05 Jun 2019 11:21:16 -0700 Received: from fmsmsx121.amr.corp.intel.com (10.18.125.36) by fmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP Server (TLS) id 14.3.408.0; Wed, 5 Jun 2019 11:21:16 -0700 Received: from fmsmsx108.amr.corp.intel.com ([169.254.9.181]) by fmsmsx121.amr.corp.intel.com ([169.254.6.125]) with mapi id 14.03.0415.000; Wed, 5 Jun 2019 11:21:15 -0700 From: "Eads, Gage" To: Ola Liljedahl , "dev@dpdk.org" , Honnappa Nagarahalli , "Richardson, Bruce" CC: nd Thread-Topic: [RFC,v2] lfring: lock-free ring buffer Thread-Index: AQHUuLnsVjNyP3ffWk++amgxuRaei6aOD9Bw Date: Wed, 5 Jun 2019 18:21:14 +0000 Message-ID: <9184057F7FC11744A2107296B6B8EB1E68CD76ED@FMSMSX108.amr.corp.intel.com> References: <1548866179-20766-1-git-send-email-ola.liljedahl@arm.com> In-Reply-To: <1548866179-20766-1-git-send-email-ola.liljedahl@arm.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiOGI3MzMwYTUtNTYwOS00Yzg2LWE4M2UtYjQzYmU5YmY5ZGRmIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiUGFxRXJySGJ2MWYzRUtrZUlnM0FFT3RMT00wQ0czZWFGMFwvMjFJMWl0UHE3eEJoNHhmK1FvcWttZkZCZlJaTnAifQ== x-ctpclassification: CTP_NT dlp-product: dlpe-windows dlp-version: 11.0.600.7 dlp-reaction: no-action x-originating-ip: [10.1.200.106] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [RFC,v2] lfring: lock-free ring buffer X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Ola, Is it possible to add burst enqueue and dequeue functions as well, so we ca= n plug this into a mempool handler? Besides the mempool handler, I think th= e all-or-nothing semantics would be useful for applications. Besides that, = this RFC looks good at a high level. For a complete submission, a few more changes are needed: - Builds: Need to add 'lfring' to lib/meson.build and mk/rte.app.mk - Documentation: need to update release notes, add a new section in the pro= grammer's guide, and update the doxygen configuration files - Tests: This patchset should add a set of lfring tests as well Code comments follow. =20 > diff --git a/lib/Makefile b/lib/Makefile index d6239d2..cd46339 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -12,6 +12,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PCI) +=3D librte_pci > DEPDIRS-librte_pci :=3D librte_eal > DIRS-$(CONFIG_RTE_LIBRTE_RING) +=3D librte_ring DEPDIRS-librte_ring := =3D > librte_eal > +DIRS-$(CONFIG_RTE_LIBRTE_RING) +=3D librte_lfring DEPDIRS-librte_lfring > +:=3D librte_eal LIBRTE_RING -> LIBRTE_LFRING > DIRS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=3D librte_mempool DEPDIRS- > librte_mempool :=3D librte_eal librte_ring > DIRS-$(CONFIG_RTE_LIBRTE_MBUF) +=3D librte_mbuf diff --git > a/lib/librte_lfring/Makefile b/lib/librte_lfring/Makefile new file mode 1= 00644 > index 0000000..c04d2e9 > --- /dev/null > +++ b/lib/librte_lfring/Makefile > @@ -0,0 +1,22 @@ > +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2010-2014 Intel > +Corporation Need to correct the copyright > + > +include $(RTE_SDK)/mk/rte.vars.mk > + > +# library name > +LIB =3D librte_lfring.a > + > +CFLAGS +=3D $(WERROR_FLAGS) -I$(SRCDIR) -O3 LDLIBS +=3D -lrte_eal > + > +EXPORT_MAP :=3D rte_lfring_version.map > + > +LIBABIVER :=3D 1 > + > +# all source are stored in SRCS-y > +SRCS-$(CONFIG_RTE_LIBRTE_RING) :=3D rte_lfring.c LIBRTE_RING -> LIBRTE_LFRING > + > +# install includes > +SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include :=3D rte_lfring.h lockfree16.h LIBRTE_RING -> LIBRTE_LFRING > + > +include $(RTE_SDK)/mk/rte.lib.mk > diff --git a/lib/librte_lfring/lockfree16.h b/lib/librte_lfring/lockfree1= 6.h new > file mode 100644 index 0000000..199add9 > --- /dev/null > +++ b/lib/librte_lfring/lockfree16.h This needs to be refactored to use the rte_atomic128_cmp_exchange interface= . > diff --git a/lib/librte_lfring/meson.build b/lib/librte_lfring/meson.buil= d new > file mode 100644 index 0000000..c8b90cb > --- /dev/null > +++ b/lib/librte_lfring/meson.build > @@ -0,0 +1,6 @@ > +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2017 Intel > +Corporation > + Correct copyright > +version =3D 1 > +sources =3D files('rte_lfring.c') > +headers =3D files('rte_lfring.h','lockfree16.h') > diff --git a/lib/librte_lfring/rte_lfring.c b/lib/librte_lfring/rte_lfrin= g.c new file > mode 100644 index 0000000..e130f71 > --- /dev/null > +++ b/lib/librte_lfring/rte_lfring.c > @@ -0,0 +1,264 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * > + * Copyright (c) 2010-2015 Intel Corporation > + * Copyright (c) 2019 Arm Ltd > + * All rights reserved. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + You can reduce the system and DPDK includes down to: #include #include =09 #include #include #include #include #include #include #include #include > +#include "rte_lfring.h" > + > +TAILQ_HEAD(rte_lfring_list, rte_tailq_entry); > + > +static struct rte_tailq_elem rte_lfring_tailq =3D { > + .name =3D RTE_TAILQ_LFRING_NAME, > +}; > +EAL_REGISTER_TAILQ(rte_lfring_tailq) > + > +/* true if x is a power of 2 */ > +#define POWEROF2(x) ((((x)-1) & (x)) =3D=3D 0) DPDK provides this functionality with RTE_IS_POWER_OF_2(), in rte_common.h > + > +/* return the size of memory occupied by a ring */ ssize_t > +rte_lfring_get_memsize(unsigned int count) { > + ssize_t sz; > + > + /* count must be a power of 2 */ > + if ((!POWEROF2(count)) || (count > 0x80000000U)) { > + RTE_LOG(ERR, RING, This library needs to replace the RING logging with its own LFRING logging. > +/* create the ring */ The comments before function definitions (e.g. "create the ring", "free the= ring") probably aren't necessary, the names are self-explanatory. > +struct rte_lfring * > +rte_lfring_create(const char *name, unsigned int count, int socket_id, > + unsigned int flags) > +{ > + char mz_name[RTE_MEMZONE_NAMESIZE]; > + struct rte_lfring *r; > + struct rte_tailq_entry *te; > + const struct rte_memzone *mz; > + ssize_t ring_size; > + int mz_flags =3D 0; > + struct rte_lfring_list *ring_list =3D NULL; > + const unsigned int requested_count =3D count; > + int ret; > + > + ring_list =3D RTE_TAILQ_CAST(rte_lfring_tailq.head, rte_lfring_list); > + > + count =3D rte_align32pow2(count); > + > + ring_size =3D rte_lfring_get_memsize(count); > + if (ring_size < 0) { > + rte_errno =3D ring_size; > + return NULL; > + } > + > + ret =3D snprintf(mz_name, sizeof(mz_name), "%s%s", > + RTE_LFRING_MZ_PREFIX, name); > + if (ret < 0 || ret >=3D (int)sizeof(mz_name)) { > + rte_errno =3D ENAMETOOLONG; > + return NULL; > + } > + > + te =3D rte_zmalloc("LFRING_TAILQ_ENTRY", sizeof(*te), 0); > + if (te =3D=3D NULL) { > + RTE_LOG(ERR, RING, "Cannot reserve memory for tailq\n"); > + rte_errno =3D ENOMEM; > + return NULL; > + } > + > + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); > + > + /* reserve a memory zone for this ring. If we can't get rte_config or > + * we are secondary process, the memzone_reserve function will set > + * rte_errno for us appropriately - hence no check in this this > + * function > + */ The "or we are secondary process" comment is out-of-date; the memzone reser= ve functions can be called by a secondary process. (For that matter, the do= cumentation in rte_memzone.h is out-of-date as well.) This restriction was = removed in commit 916e4f4f4e45. > +/* free the ring */ > +void > +rte_lfring_free(struct rte_lfring *r) > +{ > + struct rte_lfring_list *ring_list =3D NULL; > + struct rte_tailq_entry *te; > + > + if (r =3D=3D NULL) > + return; > + > + /* > + * Ring was not created with rte_lfring_create, > + * therefore, there is no memzone to free. > + */ > + if (r->memzone =3D=3D NULL) { > + RTE_LOG(ERR, RING, "Cannot free ring (not created with > rte_lfring_create()"); > + return; > + } > + > + if (rte_memzone_free(r->memzone) !=3D 0) { > + RTE_LOG(ERR, RING, "Cannot free memory\n"); > + return; > + } Better to free the ring's memzone after taking it off the list, in the unli= kely event another thread looks it up at the same time and attempts to use = it. > + > + ring_list =3D RTE_TAILQ_CAST(rte_lfring_tailq.head, rte_lfring_list); > + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); > + > + /* find out tailq entry */ > + TAILQ_FOREACH(te, ring_list, next) { > + if (te->data =3D=3D (void *) r) > + break; > + } > + > + if (te =3D=3D NULL) { > + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); > + return; > + } > + > + TAILQ_REMOVE(ring_list, te, next); > + > + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); > + > + rte_free(te); > +} > +/* search a ring from its name */ > +struct rte_lfring * > +rte_lfring_lookup(const char *name) > +{ > + struct rte_tailq_entry *te; > + struct rte_lfring *r =3D NULL; > + struct rte_lfring_list *ring_list; > + > + ring_list =3D RTE_TAILQ_CAST(rte_lfring_tailq.head, rte_lfring_list); > + > + rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK); > + > + TAILQ_FOREACH(te, ring_list, next) { > + r =3D (struct rte_lfring *) te->data; > + if (strncmp(name, r->name, RTE_LFRING_NAMESIZE) =3D=3D 0) Missing a NULL pointer check before dereferencing 'name' > + break; > + } > + > + rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK); > + > + if (te =3D=3D NULL) { > + rte_errno =3D ENOENT; > + return NULL; > + } > + > + return r; > +} > diff --git a/lib/librte_lfring/rte_lfring.h b/lib/librte_lfring/rte_lfrin= g.h new file > mode 100644 index 0000000..22d3e1c > --- /dev/null > +++ b/lib/librte_lfring/rte_lfring.h > @@ -0,0 +1,621 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * > + * Copyright (c) 2010-2017 Intel Corporation > + * Copyright (c) 2019 Arm Ltd > + * All rights reserved. > + */ > + > +#ifndef _RTE_LFRING_H_ > +#define _RTE_LFRING_H_ > + > +/** > + * @file > + * RTE Lfring > + * > + * The Lfring Manager is a fixed-size queue, implemented as a table of > + * (pointers + counters). > + * > + * - FIFO (First In First Out) > + * - Maximum size is fixed; the pointers are stored in a table. > + * - Lock-free implementation. > + * - Multi- or single-consumer dequeue. > + * - Multi- or single-producer enqueue. > + * > + */ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + You should be able to clean up the includes here too, e.g. rte_pause.h does= n't appear to be used. > +#include "lockfree16.h" > + > +#define RTE_TAILQ_LFRING_NAME "RTE_LFRING" > + > +#define RTE_LFRING_MZ_PREFIX "RG_" > +/**< The maximum length of an lfring name. */ #define Change "/**<" to "/**", otherwise doxygen applies the comment to RTE_LFRING= _MZ_PREFIX > +RTE_LFRING_NAMESIZE (RTE_MEMZONE_NAMESIZE - \ > + sizeof(RTE_LFRING_MZ_PREFIX) + 1) > + > +struct rte_memzone; /* forward declaration, so as not to require > +memzone.h */ This forward declaration is no longer necessary > +/** > + * Calculate the memory size needed for a lfring > + * > + * This function returns the number of bytes needed for a lfring, given > + * the number of elements in it. This value is the sum of the size of > + * the structure rte_lfring and the size of the memory needed by the > + * objects pointers. The value is aligned to a cache line size. > + * > + * @param count > + * The number of elements in the lfring (must be a power of 2). > + * @return > + * - The memory size needed for the lfring on success. > + * - -EINVAL if count is not a power of 2. Missing error case: -EINVAL if size exceeds size limit > + */ > +ssize_t rte_lfring_get_memsize(unsigned int count); > + > +/** > + * Initialize a lfring structure. > + * > + * Initialize a lfring structure in memory pointed by "r". The size of > +the > + * memory area must be large enough to store the lfring structure and > +the > + * object table. It is advised to use rte_lfring_get_memsize() to get > +the > + * appropriate size. > + * > + * The lfring size is set to *count*, which must be a power of two. > +Water > + * marking is disabled by default. The real usable lfring size is > + * *count-1* instead of *count* to differentiate a free lfring from an > + * empty lfring. "free lfring" -> should say "full lfring"? Same applies to rte_lfring_creat= e(). > + * > + * The lfring is not added in RTE_TAILQ_LFRING global list. Indeed, the > + * memory given by the caller may not be shareable among dpdk > + * processes. Here and in rte_lfring_create(), the documentation mentions whether or not = the lfring is added to the RTE_TAILQ_LFRING list. I don't think this makes = sense in the documentation, as it pertains to the implementation and not th= e interface. > + * > + * @param r > + * The pointer to the lfring structure followed by the objects table. > + * @param name > + * The name of the lfring. > + * @param count > + * The number of elements in the lfring (must be a power of 2). > + * @param flags > + * @return > + * 0 on success, or a negative value on error. > + */ > +int rte_lfring_init(struct rte_lfring *r, const char *name, unsigned int= count, > + unsigned int flags); > + > +/** > + * Create a new lfring named *name* in memory. > + * > + * This function uses ``memzone_reserve()`` to allocate memory. Then it > + * calls rte_lfring_init() to initialize an empty lfring. > + * > + * The new lfring size is set to *count*, which must be a power of > + * two. Water marking is disabled by default. The real usable lfring > +size > + * is *count-1* instead of *count* to differentiate a free lfring from > +an > + * empty lfring. > + * > + * The lfring is added in RTE_TAILQ_LFRING list. > + * > + * @param name > + * The name of the lfring. > + * @param count > + * The size of the lfring (must be a power of 2). > + * @param socket_id > + * The *socket_id* argument is the socket identifier in case of > + * NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA > + * constraint for the reserved zone. > + * @param flags > + * An OR of the following: > + * - LFRING_F_SP_ENQ: If this flag is set, the default behavior when > + * using ``rte_lfring_enqueue()`` or ``rte_lfring_enqueue_bulk()`` > + * is "single-producer". Otherwise, it is "multi-producers". > + * - LFRING_F_SC_DEQ: If this flag is set, the default behavior when > + * using ``rte_lfring_dequeue()`` or ``rte_lfring_dequeue_bulk()`` > + * is "single-consumer". Otherwise, it is "multi-consumers". > + * @return > + * On success, the pointer to the new allocated lfring. NULL on error = with > + * rte_errno set appropriately. Possible errno values include: > + * - E_RTE_NO_CONFIG - function could not get pointer to rte_config > structure > + * - E_RTE_SECONDARY - function was called from a secondary process > instance E_RTE_SECONDARY and E_RTE_NO_CONFIG are not possible error cases > +#define ENQ_RETRY_LIMIT 32 Per the coding guidelines (https://doc.dpdk.org/guides/contributing/coding_= style.html#macros), macros should be prefixed with RTE_. > +/** > + * Return the number of elements which can be stored in the lfring. > + * > + * @param r > + * A pointer to the lfring structure. > + * @return > + * The usable size of the lfring. > + */ > +static inline unsigned int > +rte_lfring_get_capacity(const struct rte_lfring *r) { > + return r->size; I believe this should return r->mask, to account for the one unusable ring = entry. > diff --git a/lib/librte_lfring/rte_lfring_version.map > b/lib/librte_lfring/rte_lfring_version.map > new file mode 100644 > index 0000000..d935efd > --- /dev/null > +++ b/lib/librte_lfring/rte_lfring_version.map > @@ -0,0 +1,19 @@ > +DPDK_2.0 { > + global: > + > + rte_ring_create; > + rte_ring_dump; > + rte_ring_get_memsize; > + rte_ring_init; > + rte_ring_list_dump; > + rte_ring_lookup; Need to fix function names and DPDK version number Thanks, Gage