All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Rybchenko <arybchenko@solarflare.com>
To: <xiangxia.m.yue@gmail.com>, <olivier.matz@6wind.com>,
	<gage.eads@intel.com>, <artem.andreev@oktetlabs.ru>,
	<jerinj@marvell.com>, <ndabilpuram@marvell.com>,
	<vattunuru@marvell.com>, <hemant.agrawal@nxp.com>,
	 <david.marchand@redhat.com>, <anatoly.burakov@intel.com>,
	<bruce.richardson@intel.com>
Cc: <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops
Date: Thu, 23 Apr 2020 16:38:44 +0300	[thread overview]
Message-ID: <3b1cb032-a1c2-6dee-af72-8ed690a9ad8c@solarflare.com> (raw)
In-Reply-To: <1586787714-13890-2-git-send-email-xiangxia.m.yue@gmail.com>

On 4/13/20 5:21 PM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
> 
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> 
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
> 
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
> 
>     bucket_dequeue (access null and crash)
>     rte_mempool_get_ops (should get "ring_mp_mc",
>                          but get "bucket" mempool)
>     rte_mempool_ops_dequeue_bulk
>     ...
>     rte_pktmbuf_alloc
>     rte_pktmbuf_copy
>     pdump_copy
>     pdump_rx
>     rte_eth_rx_burst
> 
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
>   priority in RTE_INIT, but it's not easy to maintain.
> 
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>   be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>   driver in future, we must make sure the order.
> 
> * register mempool orderly: Sort the mempool when registering,
>   so the lib linked will not affect the index in mempool table.
>   but the number of mempool libraries may be different.
> 
> * shared memzone: The primary process allocates a struct in
>   shared memory named memzone, When we register a mempool ops,
>   we first get a name and id from the shared struct: with the lock held,
>   lookup for the registered name and return its index, else
>   get the last id and copy the name in the struct.
> 
> Previous discussion: https://mails.dpdk.org/archives/dev/2020-March/159354.html
> 
> Suggested-by: Olivier Matz <olivier.matz@6wind.com>
> Suggested-by: Jerin Jacob <jerinj@marvell.com>
> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> ---
> v2:
> * fix checkpatch warning
> ---
>  lib/librte_mempool/rte_mempool.h     | 28 +++++++++++-
>  lib/librte_mempool/rte_mempool_ops.c | 89 ++++++++++++++++++++++++++++--------
>  2 files changed, 96 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index c90cf31467b2..2709b9e1d51b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -50,6 +50,7 @@
>  #include <rte_ring.h>
>  #include <rte_memcpy.h>
>  #include <rte_common.h>
> +#include <rte_init.h>
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -678,7 +679,6 @@ struct rte_mempool_ops {
>   */
>  struct rte_mempool_ops_table {
>  	rte_spinlock_t sl;     /**< Spinlock for add/delete. */
> -	uint32_t num_ops;      /**< Number of used ops structs in the table. */
>  	/**
>  	 * Storage for all possible ops structs.
>  	 */
> @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
>   */
>  int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>  
> +struct rte_mempool_shared_ops {
> +	size_t num_mempool_ops;

Is there any specific reason to change type from uint32_t used
above to size_t? I think that uint32_t is better here since
it is just a number, not a size of memory or related value.

> +	struct {
> +		char name[RTE_MEMPOOL_OPS_NAMESIZE];
> +	} mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> +
> +	rte_spinlock_t mempool;
> +};
> +
> +static inline int
> +mempool_ops_register_cb(const void *arg)
> +{
> +	const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> +
> +	return rte_mempool_register_ops(h);
> +}
> +
> +static inline void
> +mempool_ops_register(const struct rte_mempool_ops *ops)
> +{
> +	rte_init_register(mempool_ops_register_cb, (const void *)ops,
> +			  RTE_INIT_PRE);
> +}
> +
>  /**
>   * Macro to statically register the ops of a mempool handler.
>   * Note that the rte_mempool_register_ops fails silently here when
> @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
>  #define MEMPOOL_REGISTER_OPS(ops)				\
>  	RTE_INIT(mp_hdlr_init_##ops)				\
>  	{							\
> -		rte_mempool_register_ops(&ops);			\
> +		mempool_ops_register(&ops);			\
>  	}
>  
>  /**
> diff --git a/lib/librte_mempool/rte_mempool_ops.c b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251eb068..b10fda662db6 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -14,43 +14,95 @@
>  /* indirect jump table to support external memory pools. */
>  struct rte_mempool_ops_table rte_mempool_ops_table = {
>  	.sl =  RTE_SPINLOCK_INITIALIZER,
> -	.num_ops = 0
>  };
>  
> -/* add a new ops struct in rte_mempool_ops_table, return its index. */
> -int
> -rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +static int
> +rte_mempool_register_shared_ops(const char *name)
>  {
> -	struct rte_mempool_ops *ops;
> -	int16_t ops_index;
> +	static bool mempool_shared_ops_inited;
> +	struct rte_mempool_shared_ops *shared_ops;
> +	const struct rte_memzone *mz;
> +	uint32_t ops_index = 0;
> +

I think we should sanity check 'name' here to be not
empty string (see review notes below).

> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> +	    !mempool_shared_ops_inited) {
> +
> +		mz = rte_memzone_reserve("mempool_ops_shared",
> +					 sizeof(*shared_ops), 0, 0);
> +		if (mz == NULL)
> +			return -ENOMEM;
> +
> +		shared_ops = mz->addr;
> +		shared_ops->num_mempool_ops = 0;
> +		rte_spinlock_init(&shared_ops->mempool);
> +
> +		mempool_shared_ops_inited = true;
> +	} else {
> +		mz = rte_memzone_lookup("mempool_ops_shared");
> +		if (mz == NULL)
> +			return -ENOMEM;
> +
> +		shared_ops = mz->addr;
> +	}
>  
> -	rte_spinlock_lock(&rte_mempool_ops_table.sl);
> +	rte_spinlock_lock(&shared_ops->mempool);
>  
> -	if (rte_mempool_ops_table.num_ops >=
> -			RTE_MEMPOOL_MAX_OPS_IDX) {
> -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> +	if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
> +		rte_spinlock_unlock(&shared_ops->mempool);
>  		RTE_LOG(ERR, MEMPOOL,
>  			"Maximum number of mempool ops structs exceeded\n");
>  		return -ENOSPC;
>  	}
>  
> +	while (shared_ops->mempool_ops[ops_index].name[0]) {

Please, compare with '\0' as DPDK style guide says.

> +		if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) {
> +			rte_spinlock_unlock(&shared_ops->mempool);
> +			return ops_index;
> +		}
> +
> +		ops_index++;
> +	}
> +
> +	strlcpy(shared_ops->mempool_ops[ops_index].name, name,
> +		sizeof(shared_ops->mempool_ops[0].name));
> +
> +	shared_ops->num_mempool_ops++;
> +
> +	rte_spinlock_unlock(&shared_ops->mempool);
> +	return ops_index;
> +}
> +
> +/* add a new ops struct in rte_mempool_ops_table, return its index. */
> +int
> +rte_mempool_register_ops(const struct rte_mempool_ops *h)
> +{
> +	struct rte_mempool_ops *ops;
> +	int16_t ops_index;
> +
>  	if (h->alloc == NULL || h->enqueue == NULL ||
> -			h->dequeue == NULL || h->get_count == NULL) {
> -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> +	    h->dequeue == NULL || h->get_count == NULL) {

Changing formatting just makes review a bit more harder.

>  		RTE_LOG(ERR, MEMPOOL,
>  			"Missing callback while registering mempool ops\n");
> +		rte_errno = EINVAL;

Why is it done in the patch? For me it looks like logically
different change if it is really required (properly motivated).

>  		return -EINVAL;
>  	}
>  
>  	if (strlen(h->name) >= sizeof(ops->name) - 1) {
> -		rte_spinlock_unlock(&rte_mempool_ops_table.sl);
> -		RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n",
> -				__func__, h->name);
> +		RTE_LOG(ERR, MEMPOOL,
> +			"The registering  mempool_ops <%s>: name too long\n",
> +			h->name);

Why do you change from DEBUG to ERR here? It it not
directly related to the purpose of the patch.

>  		rte_errno = EEXIST;
>  		return -EEXIST;
>  	}
>  
> -	ops_index = rte_mempool_ops_table.num_ops++;
> +	ops_index = rte_mempool_register_shared_ops(h->name);
> +	if (ops_index < 0) {
> +		rte_errno = -ops_index;
> +		return ops_index;
> +	}
> +
> +	rte_spinlock_lock(&rte_mempool_ops_table.sl);
> +
>  	ops = &rte_mempool_ops_table.ops[ops_index];
>  	strlcpy(ops->name, h->name, sizeof(ops->name));
>  	ops->alloc = h->alloc;
> @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = {
>  	if (mp->flags & MEMPOOL_F_POOL_CREATED)
>  		return -EEXIST;
>  
> -	for (i = 0; i < rte_mempool_ops_table.num_ops; i++) {
> -		if (!strcmp(name,
> -				rte_mempool_ops_table.ops[i].name)) {
> +	for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) {
> +		if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) {

Since rte_mempool_ops_table is filled in which zeros,
name string is empty by default. So, request with empty name
will match the first unused entry. I guess it is not what we
want here. I think we should handle empty string before the
loop and return -EINVAL.

>  			ops = &rte_mempool_ops_table.ops[i];
>  			break;
>  		}
> 


  parent reply	other threads:[~2020-04-23 13:39 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-02  1:57 [dpdk-dev] [PATCH] mempool: sort the rte_mempool_ops by name xiangxia.m.yue
2020-03-02 13:45 ` Jerin Jacob
2020-03-04 13:17   ` Tonghao Zhang
2020-03-04 13:33     ` Jerin Jacob
2020-03-04 14:46       ` Tonghao Zhang
2020-03-04 15:14         ` Jerin Jacob
2020-03-04 15:25           ` Tonghao Zhang
2020-03-05  8:20 ` [dpdk-dev] [PATCH dpdk-dev v2] " xiangxia.m.yue
2020-03-05 16:57   ` Olivier Matz
2020-03-06 13:36 ` [dpdk-dev] [PATCH dpdk-dev v3] " xiangxia.m.yue
2020-03-06 13:37   ` Jerin Jacob
2020-03-07 12:51     ` Andrew Rybchenko
2020-03-07 12:54       ` Andrew Rybchenko
2020-03-09  3:01         ` Tonghao Zhang
2020-03-09  8:27           ` Olivier Matz
2020-03-09  8:55             ` Tonghao Zhang
2020-03-09  9:05               ` Olivier Matz
2020-03-09 13:15               ` David Marchand
2020-03-16  7:43                 ` Tonghao Zhang
2020-03-16  7:55                   ` Olivier Matz
2020-03-24  9:35             ` Andrew Rybchenko
2020-03-24 12:41               ` Tonghao Zhang
2020-04-09 10:52 ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization xiangxia.m.yue
2020-04-09 10:53   ` [dpdk-dev] [PATCH dpdk-dev 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-09 11:31   ` [dpdk-dev] [PATCH dpdk-dev 1/2] eal: introduce last-init queue for libraries initialization Jerin Jacob
2020-04-09 15:04     ` Tonghao Zhang
2020-04-09 15:02 ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init " xiangxia.m.yue
2020-04-09 15:02   ` [dpdk-dev] [PATCH dpdk-dev v2 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-10  6:18   ` [dpdk-dev] [PATCH dpdk-dev v2 1/2] eal: introduce rte-init queue for libraries initialization Jerin Jacob
2020-04-10 13:11     ` Jerin Jacob
2020-04-12  3:20       ` Tonghao Zhang
2020-04-12  3:32         ` Tonghao Zhang
2020-04-13 11:32           ` Jerin Jacob
2020-04-13 14:21 ` [dpdk-dev] [PATCH dpdk-dev v3 " xiangxia.m.yue
2020-04-13 14:21   ` [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops xiangxia.m.yue
2020-04-16 22:27     ` Thomas Monjalon
2020-04-27  8:03       ` Tonghao Zhang
2020-04-27 11:40         ` Thomas Monjalon
2020-04-27 12:51           ` Tonghao Zhang
2020-04-28 13:22             ` Tonghao Zhang
2020-05-04  7:42               ` Olivier Matz
2021-03-25 14:24                 ` David Marchand
2020-04-23 13:38     ` Andrew Rybchenko [this message]
2020-04-27  5:23       ` Tonghao Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b1cb032-a1c2-6dee-af72-8ed690a9ad8c@solarflare.com \
    --to=arybchenko@solarflare.com \
    --cc=anatoly.burakov@intel.com \
    --cc=artem.andreev@oktetlabs.ru \
    --cc=bruce.richardson@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=gage.eads@intel.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=jerinj@marvell.com \
    --cc=ndabilpuram@marvell.com \
    --cc=olivier.matz@6wind.com \
    --cc=vattunuru@marvell.com \
    --cc=xiangxia.m.yue@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.